Alia, the latest in the line of servers hosted at home, has one less service to host today. I've sacked the DNS service, which had in the past provided primary DNS for some of the public domains I used. However, those are all now hosted by their DNS providers. I cleaned up the BIND configuration and closed that port so that it no longer forwards in from the Internet.
The last thing it was doing was DNS for the local LAN - the internal DNS to look up the printer (mostly). This is easily handled by DNSMasq in DD-WRT, which is basically a tick-box to replace everything that Alia was doing for DNS. And it automatically adds lookups for statically configured DHCP hosts, so I don't have to set up a host once on the router for DHCP and then again on Alia for DNS.
At this point, it looks like Alia will be the last server I host at home. I've offloaded Jabber and now DNS, leaving SMTP and HTTP. SMTP is almost ready to go as there's only one personal domain with one user on it, and that user may retire the domain; otherwise we can move it to Google Apps along with the other email. That will leave HTTP which, since I can get shared hosting for less than $2 a month, is an easy one to offload. Not free, but shutting off Alia, even as an energy-efficient system (low-power CPU and everything), will save just over $2/mo in electricity consumption.
So we're coming to the end of an era. It really goes to show just how greatly improved hosted services are today, and also the breadth of features you can get from consumer products for the home. To have all the trappings of a full network, so easy to use and so cheap, is really amazing.
Sunday, 26 December 2010
Monday, 6 December 2010
Patch Your #$%^!
According to SANS, the top security threat right now is *drum roll* unpatched applications! *gasp* *shock* Yes, it's blindingly obvious, but organizations (and individuals) are downright negligent in patching desktop applications. The applications that are highly targeted are, again no surprise here, Adobe Flash, Adobe Acrobat Reader, Apple Quicktime, and Microsoft Office. And furthermore, "On average, major organizations take at least twice as long to patch client-side vulnerabilities as they take to patch operating system vulnerabilities. In other words the highest priority risk is getting less attention than the lower priority risk."
So patch your #$%^ or else Walter is going to come beat the #$%^ out of your new car while shouting "This is what happens when you find a stranger in the Alps!" .
Or block Flash, Acrobat Reader, and Quicktime - can't say I'd shed any tears for those apps myself ;)
Saturday, 13 November 2010
Disk management with Logical Volume Manager (LVM)
There is a lot of documentation on how to use Logical Volume Manager (LVM) online, but I'd like to go over how I've been using LVM to illustrate some of its strengths and weaknesses.
The initial driving issue which made LVM a killer app was handling large disks. This one system had an older SCSI RAID attached which only supported 2TB disks (a limitation of 32-bit LBA, I think), but the sum of the disks (14 x 300GB) was, well, bigger. The equipment basically let me carve the array into 2TB disks. Using LVM, I can add those Physical Volumes (PVs) to a Volume Group (VG) and create Logical Volumes (LVs) of any size desired including, ultimately, the total capacity of the RAID.
Another great feature of LVM is snapshots. Generally, a snapshot gives you a temporally fixed view of the file system for special purposes, while general use continues unimpeded because subsequent changes are stored separately. So I can take a snapshot and then back up the snapshot, which assures that the filesystem (in the snapshot) is consistent from the time the backup starts to the time the backup finishes. Snapshots can also be used as a facility to simply roll back files to a previous state. For example, I take a snapshot, run a test application which modifies a file, then restore that file from the snapshot to revert it.
However, LVM snapshots aren't as elegant as they are on some platforms. To create a snapshot, you must first have some unallocated space in your VG. You then allocate that space to the snapshot, where disk changes since the snapshot are stored. The bummer, man, is that this is a fixed amount of space you have to have on hand, and if it fills up, your snapshot device fails - and if you had, say, a long backup running, you have to restart that backup. Even with this limitation, snapshots are still pretty useful. You can sort of figure out the minimum size you need for a snapshot and ultimately, if you have snapshot space equal to the live system space, your snapshot will never fill up.
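To make that concrete, the backup-off-a-snapshot dance looks roughly like this (just a sketch - the 10G of change-tracking space and the VG/LV names are examples, matching the walkthrough below):
lvcreate -s -L 10G -n mylv-snap /dev/myvgblah/mylv    # 10G to hold changes while the snapshot exists
mount -o ro /dev/myvgblah/mylv-snap /mnt/snap         # mount the frozen view read-only
tar -czf /backup/mylv-snap.tar.gz -C /mnt/snap .      # back up the consistent copy
umount /mnt/snap
lvremove /dev/myvgblah/mylv-snap                      # discard the snapshot when done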
The last feature I'd like to rant about is online filesystem resizing. Now this is just absolutely great and very useful, especially in concert with handling large volumes and managing snapshots. First of all, if you have a hardware RAID controller which lets you add drives and expand existing arrays as an online operation, LVM is the layer which will let you expand your volumes to suit. There are two ways of doing this. The first is to expand an existing block device (e.g. grow your sda from 1TB to 1.5TB), which you have to do by modifying the partition table; this is slightly tricky but can be done online. The other way is by adding additional devices. Some RAID controllers (good ones) will let you add a second "logical disk" (or "virtual disk", depending on your vendor's jargon). If you add that additional disk, you simply initialize it as a new PV, add it to your VG, and then add whatever you want to your LVs.
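As an aside on that first route: once you've grown the underlying device and fixed up the partition, LVM still needs to be told that the PV got bigger, which (if memory serves) is just a matter of:
pvresize /dev/sda1    # re-reads the device size and grows the PV to match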
Take the first example I had where the equipment would only allow 2TB devices. So first, you put all your disks in an array, and because you've got a lot of disks, maybe reserve 1 as a hot spare. So your total capacity is (14 disks - 1 hot spare - 1 for RAID-5 parity) * 300GB = 3600GB. You carve out your first LD and it's 2TB and appears in the OS as /dev/sda. Now generally, you should be putting a partition on your drives; to my knowledge it's not required, but it's generally accepted that most disk applications will behave more sanely if they see a partition. Anyhow, so you've got /dev/sda1, so you initialize it (pvcreate /dev/sda1), you create a volume group (vgcreate myvgblah /dev/sda1), and you spin out your first LV (lvcreate -l 100%FREE -n mylv myvgblah). Hooray, you create your filesystem (mke2fs -j -L bigfs /dev/myvgblah/mylv) and mount it for regular use. Now sometime later you fill up that 2TB and realize that there's a pile of unused space. Well, you carve out another LD with the remaining 1.6TB which appears to the OS as /dev/sdb. Generally, I would expect this device to just show up, no rebooting or any crap like that. So you throw a partition on there, initialize the PV (pvcreate /dev/sdb1), and add it to the existing volume group (vgextend myvgblah /dev/sdb1). With this free space, you can either add it all (lvextend -l +100%FREE /dev/myvgblah/mylv) or you could add it incrementally (lvextend -L +100G /dev/myvgblah/mylv), reserving free space for snapshots, additional LVs, and future growth.
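Pulled out of that wall of text, the grow sequence looks something like this (same example names; note that after extending the LV you still have to grow the filesystem itself - for ext3 that's resize2fs, which can run on a mounted filesystem):
pvcreate /dev/sdb1                          # initialize the new partition as a PV
vgextend myvgblah /dev/sdb1                 # add it to the existing VG
lvextend -L +100G /dev/myvgblah/mylv        # grow the LV (or -l +100%FREE for all of it)
resize2fs /dev/myvgblah/mylv                # grow the ext3 filesystem to fill the LV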
Very handy to have all your disks in a pool (your VG) and be able to add logical drives (LVs), snapshot your drives, and incrementally expand your drives.
- Arch
Friday, 10 September 2010
Tab Mix Plus Trick
I had been using a Firefox plugin called New Tab Jumpstart which, for new tabs, shows a splash of recently used pages much like you get with Chrome. I found that it was rarely useful and I was only using a single page from it, if anything. So I removed that plugin and found the feature I needed in Tab Mix Plus: you can control what appears in a new tab, including a specific URL. Since my "home page" is 3 pages, the "home page" option isn't quite what I need, but a specific URL does just the trick.
So there, now I use 2 features of Tab Mix Plus, but it was already #1 in my Essential Plugins simply for the mouse-wheel tab scrolling.
Tuesday, 3 August 2010
Access Control Lists and Ubuntu
Basic UNIX permissions: Owner, Group, Others, each with Read, Write, Execute, plus a handful of special permissions (setuid, sticky bit, etc.). That covers 90%, maybe even 99.9%, of cases, but not 100%. Sometimes you really just want to grant more than the "owner", "group", "everyone" permissions allow, so you need Access Control Lists (ACLs).
To use ACLs, your file system must support them; if you're using a file system created this century, it probably does. ACL support is usually an option for the file system, which can either be set as a default (with tune2fs, for example) or turned on at mount time with the "acl" option (e.g. in fstab). Some distros simply default their file systems to have acl on (Fedora, RedHat EL) and others don't (Debian, Ubuntu).
To view or manipulate ACLs you also need the ACL tools: getfacl and setfacl. Distros usually have a package called "acl" available which provides these utilities, and on the distros that have ACLs defaulting on for file systems (RedHat etc.), the package is pre-installed.
The first thing you'll want to know is how to read an ACL. The utility "getfacl" (Get File ACL) can show you the ACL. This is what a file that doesn't have an ACL looks like:
$ getfacl torrentflux
# file: torrentflux
# owner: www-data
# group: www-data
# flags: -s-
user::rwx
group::r-x
other::r-x
For files that have ACLs, you will see a "+" in their permissions list in your regular ls -l output, and then you can view the ACL with getfacl:
$ ls -l
drwxr-s---+ 7 www-data www-data 4096 2009-11-21 15:06 torrentflux
$ getfacl torrentflux
# file: torrentflux
# owner: www-data
# group: www-data
# flags: -s-
user::rwx
user:archangel:r-x
user:aandrea:r-x
group::r-x
mask::r-x
other::---
As you can see, this is the same directory, but rather than granting global read/execute as under UNIX permissions, we've granted instead read/execute to two specific users with ACLs. These ACLs were created with setfacl (Set File ACL):
$ setfacl -m user:archangel:rx torrentflux
$ setfacl -m user:aandrea:rx torrentflux
If you get some error trying to use "setfacl", it's because the file system does not have the ACL option turned on. Add "acl" to the mount options for that mount point in fstab and then remount the file system.
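For example, something like this (the device and mount point here are made up, so adjust for your system):
# /etc/fstab
/dev/sda5   /srv   ext3   defaults,acl   0   2

$ sudo mount -o remount,acl /srv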
The last handy thing you may want to know is that getfacl and setfacl can be used to dump and restore ACLs. With getfacl, you can recursively pull all ACLs and skip files that have only base ACLs (UNIX permissions only). This dump can then be re-applied with setfacl. You will find this useful as not all tools that handle files handle ACLs - specifically tar.
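For example, running from the parent directory of the tree you care about (the directory name is just an example):
$ getfacl -R --skip-base torrentflux > torrentflux-acls.txt   # dump, skipping files with plain UNIX perms only
$ setfacl --restore=torrentflux-acls.txt                      # re-apply, e.g. after restoring from tar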
That's Access Control Lists for you. There's no reason not to use them - they're widely supported and very useful.
Enjoy!
- Arch
Sunday, 1 August 2010
DSL Speeds
Just came across this article on the BBC:
http://www.bbc.co.uk/news/technology-10774406
"The survey found that for DSL services advertised as being "up to" 20Mbps, only 2% of customers got speeds in the range of 14-20Mbps. Of the others, 32% were getting a 8-14Mbps service and 65%, 8Mbps or less."
2% of users get 70% (or better) of advertised speeds? That's pretty damned harsh. That's the kind of thing your customers ought to know up front.
But that's DSL for you. The article gives a fairly good explanation of some of the reasons why DSL sucks. What we need is fiber-to-the-home and none of this DSL crap:
http://www.newswire.ca/en/releases/archive/February2010/04/c6687.html
http://seekingalpha.com/article/197137-competition-is-starting-to-weigh-on-rogers-communications?
Thursday, 1 July 2010
Upgrade from Ubuntu Server 8.04 to 10.04
Well, I decided that today was the day to do the upgrade of my server, Alia, from 8.04 to 10.04. And, since I'm able to post, you can guess that it went generally fine.
It was quite brilliant really. I just ran the following command and followed the prompts:
do-release-upgrade --proposed
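(For anyone else trying this: if do-release-upgrade isn't on the box already, it comes from the update-manager-core package, so worst case it's one install away - and obviously take backups before starting.)
sudo apt-get install update-manager-core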
So far, everything looks good. New kernel (2.6.32 from 2.6.24), MySQL (5.1 from 5.0), Apache, Postfix, slapd, etc., etc. The one that looks like it needs some babysitting is Dovecot, which requires an updated config file.
Everything else worked "out of the box". And I'd consider this system fairly customized in the sense that a wide variety of applications have been installed, though where possible (and that's almost all of them) they came from the Ubuntu repositories.
So if there's anyone else out there still waffling, do it! Do the upgrade!
- Arch
Wednesday, 23 June 2010
Keeping Copies of Group Emails
One of the things that's a bit ghetto about groups in Google Apps is that groups are really just a glorified alias file. Users cannot manage their subscription or get emails delivered in batches, and there's no message archive, unlike Google Groups or a Mailman-managed list. And this is the same problem with Microsoft Exchange (at least up to 2007, probably 2010 too).
Okay, so ranting aside, here's a couple quick hacks to squeeze a couple features out of groups in GA.
Archiving. Create a mailbox and add it to the group. Shazzam! This is better in Exchange, where you can share that mailbox easily with many users and limit them to read-only access so people aren't deleting your archive.
Mailing list features. Well, your only answer for now is going to be to forward messages to a mailing list. So point mylist@example.com to mylist-example-com@googlegroups.com and members should subscribe directly to the Google Group instead.
Aliases. Now this is one feature I would have preferred in the face of the above limitations of GA groups. That is, if I've got a group called "hibuddy@example.com", I also want to have "heybuddy@example.com" and other variations. So here, create a mailbox called "hibuddy@example.com" and rename (or create) a group called "hibuddy-group@example.com". You can add as many aliases as you want to the mailbox, and then configure that mailbox to just forward to the group.
Ciao
- Arch
Wednesday, 12 May 2010
Clonezilla Good! Fire Bad!
Clonezilla, quite simply, is tha bomb. It's really fast, very flexible, it will do everything including your laundry.
You get basically two styles of cloning systems (or disks in general). Either one at a time with the LiveCD or many at a time with a multicasting server. I've only tried the liveCD method since I was simply doing two hosts. And in my case, I was dealing with the 'doze which is always more of a pain than it should be. So here's what I did to clone a Windows Server 2003 install to two hosts.
- Get the Windows host installed and setup with all the desired applications but not joined to the domain
- Create an unattended install file for Sysprep (it's a quick wizard)
- SAVE THAT SYSPREP FILE (for some reason, sysprep will destroy this as incriminating evidence?)
- Sysprep the host - this will strip the Security ID (SID), computer name, and remove it from the domain (if you had it on one) and it shuts down the host
- Get the Clonezilla LiveCD and something for external storage
- Boot the sysprepped host from the liveCD
- Basically defaults all the way, it will ask what the storage media for system images is, what disk or partition to copy (I did it by partition, though you could do disk if you wanted to keep the partition info)
- It ripped a 5.4GB base server install into a ~2GB image in about 5 minutes
- Reboot, reconfigure PC with a name, join it to the domain, etc
Then on each target host,
- Boot from the Clonezilla LiveCD
- Attach the external storage
- Follow the wizard
- It restored the above partition for me in 2 minutes, 17 seconds
- Reboot, give the PC a name, put it in the domain, etc
- Repeat for each host you are cloning
The crazy thing I found was that "proprietary" cloning tools were hard to find. Basically, Symantec has been buying up everyone in the field, killing the products, and then telling everyone to use Ghost, which, at least from when they acquired Norton until recently, did not take offline disk copies. Instead, you have to install the application in the OS (which you'll note is impossible with Sysprep since the host is SHUT OFF) and it does a "hot backup". It just doesn't work for cloning at all. WTH?
But apparently, between some more sophisticated usage of Sysprep and a "Clonezilla server", you could have your PCs, say in a lab, all doing PXE boot, re-imaging themselves, and picking up their name and domain information simultaneously. Once set up, you could do a lab of, I don't know what size, but whatever the max number of clients is (presumably dozens or hundreds) in less time than it takes to get a Starbucks.
- Arch
Tuesday, 20 April 2010
Launching Outlook Calendar
I wouldn't normally post about using an application, but nevertheless, this is a very handy trick for me. I usually run Thunderbird and Outlook Web Access (OWA). OWA is good for viewing your calendar, not so much shared calendars. And if I launch Outlook when Thunderbird is already running, Exchange goes crazy with my inbox. So, I often find I want to launch Outlook but only for the calendar. Microsoft has a handy page on how to Customize Outlook to start with the Calendar open. And in summary, you just need to add this to whatever shortcut you use to launch Outlook:
/select outlook:calendar
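So the full shortcut target ends up looking something like this (the Office path here is just an example - adjust for your version and install location):
"C:\Program Files\Microsoft Office\Office12\OUTLOOK.EXE" /select outlook:calendar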
I added this to my quick launch link; the only downside is that every time Outlook is updated, that link gets stomped.
But that's it.
- Arch
Monday, 29 March 2010
GUI Bad! SQL Good!
SQL Server (2005) hasn't been very kind to me lately. Among its many faults, the one that cheesed me today was that in the course of testing a problem, I wanted to take one of the databases offline, do some stuff, and bring it online again. Well, don't do this through the Management Studio GUI. When we did this, the offline process just hung there. According to Pinal Dave, the recommended way of doing things is like this:
ALTER DATABASE [mydb] SET OFFLINE WITH ROLLBACK AFTER 30 SECONDS
This way, if there's some wedged transactions (as were the source of our problems in the first place), this should rollback anything that doesn't finish in 30 seconds.
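And when you're done doing your stuff, bringing it back is just:
ALTER DATABASE [mydb] SET ONLINE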
- Arch
Thursday, 25 February 2010
Nagios Agents (NRPE)
In an earlier post, I mentioned Nagios as a system monitoring tool. It's simple, it's flexible, and out of the box you can monitor network services without any software installed on the monitored systems.
Now if you want to monitor other aspects of a system, like its disk usage, you can either make that information generically visible on the network (say with SNMP) or you can install an agent for Nagios. The most common agent is NRPE.
Like everything else in Nagios, you first need a plugin for Nagios to be able to check NRPE, and there's a standard plugin available called, well, check_nrpe. Use your package manager of choice to install this plugin (nagios-plugins-nrpe in Fedora). I found that although this installed the Nagios plugin, it did not create a command definition, so I created one myself. First run the check_nrpe command manually to see what arguments it takes and then add your command definition to your Nagios configuration. It should look something like this:
# 'check_nrpe' command definition
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$
}
The command definition specifies the name of the command and then simply its invocation. The macros given ($USER1$, etc.) are pretty generic and it's pretty easy to work from existing command definitions or the Nagios documentation.
Now once you get NRPE installed on a client, the service definition is going to look something like this:
define service{
        use                     generic-service
        host_name               Hudson
        service_description     DISK_ROOT
        check_command           check_nrpe!check_root
}
You should be able to get the NRPE agent installed on many "Linux" distros from the package manager. The agent can either run under inetd (preferred) or as a stand-alone daemon. If you are using xinetd (which you should), make sure you specify the Nagios server in the only_from line, enable the service and then kick xinetd. Since you're using xinetd, basically all the service configuration is there leaving really only the command definitions in NRPE's main config file (/etc/nagios/nrpe.cfg). In the main config file, you are going to specify the commands that can be run. Here's the definition for the check_root command:
command[check_root]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
As you can see, the command definition provides all the arguments needed such that the Nagios server should not ever have to pass any arguments to NRPE. This is for both safety and simplicity.
Now you're done! Reload your NRPE and Nagios processes and check back in a few minutes to ensure your service check is working. If it's not, typical issues are that the port is firewalled (TCP 5666 by default) or the Nagios host was not specified correctly in the only_from line (or the allowed_hosts line if not using xinetd).
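For reference, the xinetd service file (usually dropped in by the distro package) ends up looking roughly like this - the paths and addresses below are placeholders, so adjust to your setup:
# /etc/xinetd.d/nrpe
service nrpe
{
        disable         = no
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/sbin/nrpe
        server_args     = -c /etc/nagios/nrpe.cfg --inetd
        only_from       = 127.0.0.1 192.168.1.10
}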
Next up is to monitor a Windows host. Since Microsoft doesn't have a convenient software repository of third-party applications, you get to go download and install an agent yourself. There are a handful of choices, but generally NSClient++ (NSCP) will be the one you want. It supports a variety of protocols including NRPE and NSCA (NSCA is for submitting passive checks). When you install NSCP, the installer will let you enable NRPE and should handle setting NRPE up as a service and opening the firewall for it. The one thing you have to do is either enable external scripts (preferred) or enable arguments. There are a handful of stock scripts and aliases provided which get you most of the basic functionality like checking disk usage.
One last note is that you can always quickly check if the NRPE (or NSCP) process is talking to the server okay by simply running the check_nrpe plugin manually giving it only the host. It will report OK if NRPE is working or an error if it is not:
[root@alma nagios]# /usr/lib/nagios/plugins/check_nrpe -H hudson
Connection refused by host
[root@alma nagios]# /usr/lib/nagios/plugins/check_nrpe -H hudson
NRPE v2.12
- Arch
Tuesday, 26 January 2010
Essential Application Plugins
The nice thing about programs like Firefox and Thunderbird is that you can get a lot of community-created plugins to make the program look and do what you want. The downside of programs like Firefox and Thunderbird is that there are (at least for me) a few plugins that have to be installed before they work well. So to that end, I've started building up a list of essential plugins.
The plugin model isn't perfect, but it far exceeds the alternative, which is that your applications all suck (Microsoft, I mean you). Heck, Nagios at its core doesn't do anything at all for you - it's all plugins - and I can't rave enough about how great an application Nagios is.
- Arch