Ranting, Technically Speaking

Tuesday, 20 April 2010

Launching Outlook Calendar

I wouldn't normally post about using an application, but nevertheless, this is a very handy trick for me. I usually run Thunderbird and Outlook Web Access (OWA). OWA is good for viewing your calendar, not so much shared calendars. And if I launch Outlook when Thunderbird is already running, Exchange goes crazy with my inbox. So, I often find I want to launch Outlook but only for the calendar. Microsoft has a handy page on how to Customize Outlook to start with the Calendar open. And in summary, you just need to add this to whatever shortcut you use to launch Outlook:

/select outlook:calendar

I added this to my quick launch link. The only downside is that every time Outlook is updated, that link gets stomped.

But that's it.

- Arch

Monday, 29 March 2010

GUI Bad! SQL Good!

SQL Server (2005) hasn't been very kind to me lately. Among its many faults, the one that cheesed me today was that in the course of testing a problem, I wanted to take one of the databases offline, do some stuff, and bring it online again. Well, don't do this through the Management Studio GUI. When we did this, the offline process just hung there. According to Pinal Dave, the recommended way of doing things is like this:

ALTER DATABASE [mydb] SET OFFLINE WITH ROLLBACK AFTER 30 SECONDS

This way, if there are some wedged transactions (as were the source of our problems in the first place), anything that doesn't finish in 30 seconds should get rolled back.

- Arch

Thursday, 25 February 2010

Nagios Agents (NRPE)

In an earlier post, I mentioned Nagios as a system monitoring tool. It's simple, it's flexible, and out of the box, you can monitor network services without any software installed on the monitored systems.

Now if you want to monitor other aspects of a system, like its disk usage, you can either make that information generically visible on the network (say with SNMP) or you can install an agent for Nagios. The most common agent is NRPE.

Like everything else in Nagios, you first need a plugin for Nagios to be able to check nrpe and there's a standard package available called, well, check_nrpe. Use your package manager of choice to install this plugin (nagios-plugins-nrpe in Fedora). I found that although this installed the Nagios plugin, it did not create a command definition so I created one myself. First run the check_nrpe command manually to see what arguments it takes and then add your command definition to your Nagios configuration. It should look something like this:

# 'check-nrpe' command definition
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$
        }

The command definition specifies the name of the command and then simply its invocation. The macros given ($USER1$, etc.) are pretty generic and it's pretty easy to work from existing command definitions or the Nagios documentation.

Now once you get NRPE installed on a client, the service definition is going to look something like this:

define service{
        use                             generic-service
        host_name                       Hudson
        service_description             DISK_ROOT
        check_command                   check_nrpe!check_root
        }
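
To see how the service's check_command lines up with the command definition, here is a rough Python sketch of the macro substitution. It is only a toy illustration, not Nagios's own code, and the host address and plugin directory are made-up values: the part before the first "!" picks the command, and whatever follows fills in $ARG1$, $ARG2$ and so on.

```python
# Toy illustration only: how a check_command such as "check_nrpe!check_root"
# maps onto the command_line template. This is not how Nagios is implemented.

def expand_command(check_command, commands, host_address, plugin_dir):
    """Expand a 'name!arg1!arg2' check_command into a full command line."""
    name, *args = check_command.split("!")
    line = commands[name]
    line = line.replace("$USER1$", plugin_dir)
    line = line.replace("$HOSTADDRESS$", host_address)
    # Fill in $ARG1$ and $ARG2$; arguments that were not given become empty.
    for n in range(1, 3):
        value = args[n - 1] if n <= len(args) else ""
        line = line.replace(f"$ARG{n}$", value)
    return line.strip()


commands = {
    "check_nrpe": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$",
}

expanded = expand_command(
    "check_nrpe!check_root",
    commands,
    host_address="192.0.2.10",               # hypothetical address for the host
    plugin_dir="/usr/lib64/nagios/plugins",  # assumed value of $USER1$
)
print(expanded)
```

So the server ends up running check_nrpe against the host and asking the agent for its check_root command, with nothing passed in $ARG2$.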

You should be able to get the NRPE agent installed on many "Linux" distros from the package manager. The agent can either run under inetd or xinetd (preferred) or as a stand-alone daemon. If you are using xinetd (which you should), make sure you specify the Nagios server in the only_from line, enable the service and then kick xinetd. Since you're using xinetd, basically all the service configuration is there, leaving really only the command definitions in NRPE's main config file (/etc/nagios/nrpe.cfg). In the main config file, you are going to specify the commands that can be run. Here's the definition for the check_root command:

command[check_root]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /

As you can see, the command definition provides all the arguments needed such that the Nagios server should not ever have to pass any arguments to NRPE. This is for both safety and simplicity.
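
If you want a quick look at which commands an agent will accept, a few lines of Python can pull the command[...] entries out of the config text. This is just a sketch: the sample text stands in for a real /etc/nagios/nrpe.cfg, and the parsing only handles the simple one-line form shown above.

```python
import re

# Matches lines of the form: command[name]=command line
COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<line>.+)$")


def parse_nrpe_commands(config_text):
    """Return a dict mapping NRPE command names to their command lines."""
    commands = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group("name")] = match.group("line")
    return commands


# Stand-in for the contents of a real nrpe.cfg
sample = """\
# sample fragment in the style of nrpe.cfg
allowed_hosts=127.0.0.1
command[check_root]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
"""

nrpe_commands = parse_nrpe_commands(sample)
print(nrpe_commands["check_root"])
```

The names that come out of this are exactly what you can put after the "!" in a check_nrpe service check.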

Now you're done! Reload your NRPE and Nagios processes and check back in a few minutes to ensure your service check is working. If it's not, typical issues are that the port is firewalled (TCP 5666 by default) or the Nagios host was not specified correctly in the only_from line (or the allowed_hosts line if not using xinetd).

Next up is to monitor a Windows host. Since Microsoft doesn't have a convenient software repository of third-party applications, you get to go download and install an agent yourself. There are a handful of choices but generally, NSClient++ (NSCP) will be the one you want. It supports a variety of protocols including NRPE and NSCA (NSCA is for submitting passive checks). When you install NSCP, the installer will let you enable NRPE and should handle setting up NRPE as a service and opening the firewall for it. The one thing you have to do is either enable external scripts (preferred) or enable arguments. There are a handful of stock scripts and aliases provided which get you most of the basic functionality like checking disk usage, etc.

One last note is that you can always quickly check if the NRPE (or NSCP) process is talking to the server okay by simply running the check_nrpe plugin manually giving it only the host. It will report OK if NRPE is working or an error if it is not:

[root@alma nagios]# /usr/lib/nagios/plugins/check_nrpe -H hudson
Connection refused by host
[root@alma nagios]# /usr/lib/nagios/plugins/check_nrpe -H hudson
NRPE v2.12

- Arch

Tuesday, 26 January 2010

Essential Application Plugins

The nice thing about programs like Firefox and Thunderbird is that you can get a lot of community-created plugins to make the program look and do what you want. The downside of programs like Firefox and Thunderbird is that there are (at least for me) a few plugins that have to be installed before they work well. So to that end, I've started building up a list of essential plugins.

The plugin model isn't perfect, but it far exceeds the alternative which is that your applications all suck (Microsoft, I mean you). Heck, Nagios at the core doesn't do anything at all for you, it's all from plugins and I can't rave enough about how great an application Nagios is.

- Arch

Thursday, 31 December 2009

Fire Bad!

Battery backup at home went off today BEEEEEEEEEEEEEEEEEEEEEEP BEEEEEEEEEEEEEEEP! Everything shuts itself down and I go to reboot the UPS when *sniff* *sniff* ah yes, the distinctive smell of burned electronics. So that's it finally. Adios APC BackUPS 350. You will torture us no more with your intermittent failures!

Now I have to look for a new UPS. Preferably small in size (it has to go under my desk) and monitored. APC's successor to the UPS model I had wasn't monitored last time I checked, though maybe they've got a newer model that is. If not, I'll have to look at other manufacturers and then that means looking at software support, etc.

Oh well, out with the old! Happy New Year's!

- Arch

Friday, 6 November 2009

Nagios Rules All

Nagios is a network monitoring application which itself provides no actual monitoring but rather specializes in scheduling checks and notifications. As a module framework, it works very well and there are a lot of monitoring plugins and all told, there aren't many (or any) systems that really compare, F/OSS, proprietary, or otherwise.

Since it's not a complete solution in and of itself, I know at least I found it a bit daunting to get in to. So I got this book:

Building a Monitoring Infrastructure with Nagios by David Josephsen

It's not a huge book like say, HP's OpenView manual(s), so read it first.

Nagios is super cool. You build definitions for each host on your network and each service on each host. Nagios checks each service, recording the service's status. When a service fails, Nagios will send a notification once it is sure the service is down and then periodically until it comes back up.

Fine, that's the basic premise. Now the configuration works pretty well because any host can inherit its configuration from any other host definition including host definition templates. So set your general parameters once, and then override where necessary. It's the same for services. And you also have host and service groups which allow you to logically group hosts (or services).

Nagios doesn't have any built-in way to check services, it's all through plugins. A plugin is simply an external script or program which exits with a status of 0 for OK, 1 for warning, or 2 for critical (3 means unknown) and an optional single line of standard output for status text. Nagios has many standard plugins available, for example the check_ping plugin. This plugin is a little wrapper script which is invoked with arguments specifying the warning and critical thresholds for response time and packet loss. So in testing a plugin, you can simply invoke the plugin with the arguments that Nagios would be feeding it.
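
To make the plugin contract concrete, here is a minimal sketch of a disk check in Python. It is not one of the standard plugins; the function names, the status text, and the default thresholds are all my own choices for illustration. The only parts that matter to Nagios are the exit status and the single line on standard output.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # exit statuses Nagios understands


def check_free_percent(free_pct, warn_pct, crit_pct):
    """Return (exit_code, status_line) for a free-space percentage."""
    if free_pct < crit_pct:
        return CRITICAL, f"DISK CRITICAL - {free_pct:.0f}% free"
    if free_pct < warn_pct:
        return WARNING, f"DISK WARNING - {free_pct:.0f}% free"
    return OK, f"DISK OK - {free_pct:.0f}% free"


def main(path="/", warn_pct=20.0, crit_pct=10.0):
    """Check free space on 'path'; print one status line and return the exit code."""
    usage = shutil.disk_usage(path)
    free_pct = 100.0 * usage.free / usage.total
    code, text = check_free_percent(free_pct, warn_pct, crit_pct)
    print(text)
    return code


# A real plugin would finish with: import sys; sys.exit(main())
code, text = check_free_percent(15.0, warn_pct=20.0, crit_pct=10.0)
print(code, text)
```

Run by hand, a plugin like this prints its status line and you can inspect the exit status from the shell, which is the same thing Nagios does when it schedules the check.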

Now if a service goes down, Nagios will check if the host is down. Again, this is a plugin of the same type as for the service. Typically, this means check_ping. So you don't really need to have a check_ping "service" check, just for the host. So if your host runs a webserver, you would use check_http and if that fails, Nagios will check_ping on the host to see if that's down. If the host is down, well then obviously all services on that host are a write-off so Nagios will send a notification for that host once rather than for each individual service.

And when I say Nagios will send a notification, it doesn't know how to do that either. Notifications are also defined but typically the stock notification will suffice. On Fedora, it uses the "mail" program to send a mail message.

Ah, so who does it notify? Well, each service and each host defines contact groups and also contact hours. So Nagios will notify everyone in a contact group if it's during notification hours. So you can monitor your development systems as well as production ones and only get notifications when appropriate.
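
Roughly, the decision looks like the Python sketch below. The data layout is an assumption of mine purely for illustration (Nagios defines its time periods in its own config objects), and the contact names are made up.

```python
from datetime import datetime, time

# Hypothetical notification period: weekdays 09:00-17:00 only.
# Keys are weekday numbers as returned by datetime.weekday() (Monday is 0).
WORKHOURS = {day: (time(9, 0), time(17, 0)) for day in range(5)}


def in_period(period, when):
    """Return True if the datetime 'when' falls inside the period."""
    window = period.get(when.weekday())
    if window is None:
        return False  # no notifications at all on this day
    start, end = window
    return start <= when.time() < end


def contacts_to_notify(contact_group, period, when):
    """Return the contacts to notify, or an empty list outside the period."""
    return list(contact_group) if in_period(period, when) else []


dev_admins = ["alice", "bob"]                     # made-up contact group
monday_morning = datetime(2009, 11, 2, 10, 30)    # a Monday
saturday_night = datetime(2009, 11, 7, 23, 0)     # a Saturday

print(contacts_to_notify(dev_admins, WORKHOURS, monday_morning))
print(contacts_to_notify(dev_admins, WORKHOURS, saturday_night))
```

With a period like this on the development hosts and a round-the-clock one on production, the same contact group only gets woken up for the systems that deserve it.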

Nagios also provides escalations. So if a service (or host) remains down, you can define an escalation path. Maybe level one is help desk, and if they don't respond, it escalates to supervisors as well, and if they still don't respond, then it escalates to on-call staff, managers and eventually the head cheese.

What else is cool? Oh yeah, parent-child relationships. On each host, you can define parent hosts. So if you have say several routers throughout your network, connectivity to hosts would depend on connectivity to their routers. So if a router goes down, as with services on a host, Nagios will know to only notify of the router being down and not all the children individually.
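
The idea can be sketched in a few lines of Python. This is a simplification under my own assumptions, not Nagios's actual reachability logic, and the host names are invented: a down host whose parents are all down too is treated as unreachable and left out of the notifications.

```python
def notifications(parents, down):
    """Return the down hosts worth notifying about.

    'parents' maps each host to a list of its parent hosts and 'down' is the
    set of hosts failing their host check. A down host whose parents are all
    down is treated as unreachable and is not reported.
    """
    to_notify = []
    for host in sorted(down):
        host_parents = parents.get(host, [])
        if host_parents and all(p in down for p in host_parents):
            continue  # unreachable behind a failed parent
        to_notify.append(host)
    return to_notify


# Invented topology: two servers behind one router
parents = {
    "router1": [],
    "web1": ["router1"],
    "db1": ["router1"],
}

print(notifications(parents, down={"router1", "web1", "db1"}))
print(notifications(parents, down={"web1"}))
```

In the first case only the router is reported; in the second, web1 is reported because its router is still up.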

There is also an agent for Nagios called the NRPE. It is totally optional but if you want local system checks, like disk, CPU, checking running processes, and not just network service checks, then NRPE lets you do this. Install NRPE on your monitored hosts, and NRPE is available for "Linux" and Windows, and it, I think, is like a little baby Nagios invoked by the mothership. So you install service check plugins with the NRPE and then on the server, your service checks are like check_nrpe!check_disk ... or something like that so the server sends the service check to the NRPE on the monitored system. I haven't used this yet, but will definitely be doing so.

The NEB is another cool part of Nagios. The Nagios Event Broker is an interface where you can write programs which hook into Nagios's regular operations. There are a couple dozen callback functions you can hook into and this makes the possibilities for Nagios virtually endless.

The part I've left for last is the user interface. Well, once again, there is none. You configure it, it fires off notifications, that's the core. There is a web interface Nagios provides that you can use if you want. It will show you host and service status and you can acknowledge alarms through it and schedule downtime for hosts. Now you can hook in more functionality, for example, historical graphs can be very useful. If you're checking disk usage with Nagios, why not keep a record of it? Well, when you get a service check result back, it comes back with one line of text from standard output, right? Well, there are packages that will build graphs from this data so that you can have your service status and historical reports too! Josephsen has a pretty extensive discussion on doing this kind of stuff and some great info on some of the options out there.
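
Those graphing packages generally work from the performance data that many plugins append to the status line after a pipe character, as label=value pairs followed by optional thresholds. Here is a rough Python sketch of pulling the numbers out. The sample line is made up in the style of check_disk output, and the parser is deliberately simplified (it assumes labels contain no spaces and every value is numeric).

```python
import string


def parse_perfdata(output_line):
    """Split plugin output into status text and a dict of label -> numeric value."""
    text, _, perf = output_line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, fields = item.partition("=")
        value = fields.split(";")[0]  # drop warn/crit/min/max fields
        # Strip a trailing unit such as MB, %, or s.
        number = value.rstrip(string.ascii_letters + "%")
        metrics[label.strip("'")] = float(number)
    return text.strip(), metrics


# Made-up sample in the style of check_disk output
sample = "DISK OK - free space: / 3326 MB (56%) | /=2643MB;5948;5958;0;5968"
status, metrics = parse_perfdata(sample)
print(status)
print(metrics)
```

Store those values with a timestamp every time the check runs and you have the raw material for a historical graph.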

So, yeah. Get that book! Use Nagios! Monitor everything with it! Let it tell you when your toast is toasted or your beer needs a refill!

- Arch

Thursday, 1 October 2009

Fedora Bootable USB

LiveUSB Creator, it's a wonderful thing. Connect a USB key, get the LiveUSB Creator on your PC (Windows or "Linux"), point it either to a local .iso file for a Fedora live CD or let it download the version you want for you, click go, and shazzam! (yes, "shazzam") You've now got a bootable Fedora USB key. And if you gave it a block of persistent storage, you've got, well, persistent storage to use in this OS for data files etc.

- Arch