Welcome NetConnect Folks!

Posted on Friday 19 October 2007

I’d like to welcome the NetConnect Members to the EffectiveMonitoring blog. The presentation “Find it Fast: Using AppManager Stats to Identify Problems Quickly” is now available on the Presentations page.

As mentioned during the presentation, this entire blog is dedicated to monitoring techniques. They apply regardless of the tool that you use, the underlying operating systems that you have, and the way your environment is organized. This latest presentation is, of course, about AppManager and Windows operating systems because of the conference, but the general topics apply more broadly, and I’ll be covering them in more detail in later entries.

If you want to contact me for any reason, please just use the contact page. I’m happy to talk to anyone about the presentation, or your monitoring issues. I’ve helped people solve quite a few issues.

For now, you’re catching the blog in the middle of a discussion about how to create and apply alerts based on my 10 years of experience with monitoring tools.

As for a recap of the conference, it was very much worth my time. I had a great time and had plenty of opportunity to talk to other monitoring folks and to people from NetIQ. I spent much of it trying to get certain enhancement requests in front of PMs and support. Being able to explain these things in person helped a lot, as many of them are seemingly complex issues that can be conveyed more simply in a 5-minute conversation than in an email that reads like a book.

For example, there’s an issue with how AppManager handles events across reboots and restarts of the agent. A feature called event collapsing makes sure that only one event is triggered upon an error and, depending on how it’s configured, a continuing failure will get just one event. These events can also trigger actions, such as a page or an email. So, let’s take a situation where the agent is monitoring a website and is set to raise an event in the console and send an email via SMTP. If the website is down, you’ll get one event and one email, no matter how long it’s down. You’ll only get another email and event if the website comes up and then goes down again. A very useful feature, for certain.
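
To make the collapsing behavior concrete, here is a minimal Python sketch. It is not how AppManager implements the feature; the `EventCollapser` class and its `record` method are hypothetical names made up for illustration, and the only assumption is that each check reports a simple up or down result.

```python
class EventCollapser:
    """Raise one event per outage, however many failed checks the outage spans.

    Hypothetical illustration only; not AppManager's implementation.
    """

    def __init__(self):
        # Names of checks that already have an open (reported) event.
        self.open_events = set()

    def record(self, check_name, ok):
        """Record one check result. Return True only when a new event should fire."""
        if ok:
            # Recovery closes the event, so the next failure counts as a new outage.
            self.open_events.discard(check_name)
            return False
        if check_name in self.open_events:
            # Still down: collapse this failure into the existing event.
            return False
        self.open_events.add(check_name)
        return True


collapser = EventCollapser()
results = [False, False, False, True, False]  # down, down, down, up, down
fired = [collapser.record("website", ok) for ok in results]
print(fired)
```

Only the first failure of each outage returns `True`, which is where the email or page would be sent.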

But if you reboot your monitoring agent server (the one that is monitoring the website, NOT the monitoring infrastructure backend, which is not tied to these events), you’ll get another event and email when it comes back up. You’ll also get one if you restart the monitoring agent, or if you change any of the monitoring properties. This sends another alert to people, who will then think that the website came back up and has gone down again. The event state should persist across reboots, restarts, and changes to the policy to avoid this problem.
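
One way to picture the behavior I’m asking for is a collapser that writes its open events to disk and reads them back on startup. The Python sketch below is only an illustration of that idea under my own assumptions: the `PersistentCollapser` class, the JSON state file, and its location are all hypothetical, and none of this reflects how AppManager stores event state.

```python
import json
import os
import tempfile


class PersistentCollapser:
    """Event collapsing whose open-event state survives an agent restart.

    Hypothetical illustration only; the JSON state file is an assumption.
    """

    def __init__(self, state_path):
        self.state_path = state_path
        self.open_events = self._load()

    def _load(self):
        # Restore the open events written by a previous run, if there are any.
        try:
            with open(self.state_path) as f:
                return set(json.load(f))
        except FileNotFoundError:
            return set()

    def _save(self):
        with open(self.state_path, "w") as f:
            json.dump(sorted(self.open_events), f)

    def record(self, check_name, ok):
        """Record one check result. Return True only when a new event should fire."""
        was_open = check_name in self.open_events
        if ok:
            self.open_events.discard(check_name)
        else:
            self.open_events.add(check_name)
        if was_open == ok:
            # The open/closed state changed (new outage or recovery), so persist it.
            self._save()
        return (not ok) and (not was_open)


state_path = os.path.join(tempfile.mkdtemp(), "open_events.json")
before_restart = PersistentCollapser(state_path).record("website", ok=False)
# Simulate an agent restart by building a fresh object from the same state file.
after_restart = PersistentCollapser(state_path).record("website", ok=False)
print(before_restart, after_restart)
```

The second object stands in for the restarted agent: because it reloads the saved state, the ongoing outage does not fire a second event.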

I know that your eyes are probably glazing over reading this explanation, and it still may not make sense, especially if you’ve never used the product. But it’s a problem that has a real effect on large shops such as ours. I’m glad that I was able to bring this and other issues to their attention, because I hope that it can make a feature list. As you can imagine, I’m very detail-oriented when it comes to these features, because these seemingly small feature issues can cause major problems in environments as large as the ones that I face.

I will, perhaps, put up my wish list for AppManager on the chicagoiq.net website (which I run). If you’re an AppManager user and want to join together with my organization because you have the same pain that we do, feel free to join in there.

In the next post, which I might be able to write as I continue to wait for my plane back, I’ll talk about some of the future topics that I’d like to cover based on talking to other companies. There are a lot of problems that we’ve gotten past in our organization, and I think that there’s a lot that I can share that will save you quite a lot of time in your monitoring.

If my plane is delayed even more, who knows, I’ll possibly write even more articles while they’re still fresh. Meanwhile, I’m heading to my gate now.

The Break

Posted on Monday 8 October 2007

I apologize for the recent break in posts. I was just finishing writing a book that is coming out from St. Martin’s Press next year. I felt guilty if I did any writing that wasn’t dedicated to finishing by my deadline. The manuscript is with the editor now, so I’ll be writing articles for this site, possibly multiple articles a day, to catch up.

In case you’re curious, the book is about how to run your own independent band. There’s a surprising amount of technology involved with music now. Since music is my avocation, both my skills as a technologist and my experience as a musician came in handy.

Also, I’ve never believed that you can have just one passion in life. Mine are information technology, music, and writing.

The next book I do will be on the topics in this blog, which I will share here in article-sized pieces.

Welcome ITSMF

Posted on Friday 24 August 2007

I’d like to welcome the ITSMF folks to EffectiveMonitoring.com!

I’ve had requests for the slides from the presentation that I gave on Thursday. I’m going to post them, along with those from any other presentations I give, on the Presentations page on this website.
