Getting There: Successful Goal Setting

Yesterday I had an interesting discussion with our other web guy.  We recently underwent a fairly large redesign of our web site, which included the addition of new features and tools (read: more work for everyone).  One of these is the addition of a home page centerpiece linked to stories about the people of the college: students, faculty, and alumni.  The thing is, it’s not our job to write the stories, and it’s not necessarily PR’s problem that it was determined we need lots of these.  And so the two groups talked yesterday about the scenario.

This was all relayed to me, which got me thinking: What are they trying to accomplish?  There were lots of ideas thrown about, such as changing the release schedule, doing stories on campus things, highlighting groups or programs, etc.  My main concern was that some of these worked against our original theme of showcasing the people of the university.  In all fairness, this original idea was made pretty arbitrarily.  We made an assumption that showcasing successful people is a good use of that space.
Us:  “What should go there?”
Them:  “I dunno, people like seeing people.”
Us:  “Okay, people it is!”

The problem now is keeping up, and trying not to impact people’s already full workload.  But I believe there was a clear deficit in the discussion: nobody defined what they were trying to solve.  There’s a lot of talk about problem solving and goal setting that goes on in the business world, and for good reason.  How can you effectively tackle a task without defined parameters?  In this case, the problem was two-fold: workload and story quality.  One of the ideas thrown out was doing stories on things like building history.  But does this really solve the initial problem?  Of course not.  You still have an issue of needing someone to write the story, and that writing still has to be concise, engaging, and compelling.

Our original line of reasoning, to develop a people-centric centerpiece, was sound enough, I think.  As a university, our commodity is people, after all.  Potential students, parents, and donors need to see a successful product, just as if we were making suspenders, or clever shoe horns.  If all we had to show were men with their pants around their ankles and shoes thrown to and fro, we wouldn’t be putting forth a successful air… we’d be Congress.  Using our above example, how could we better tackle this?

  • Define the problem: Be clear about what exactly it is that’s wrong.  In education, there’s always someone around to tell you things are broken; we’re here to help define and explain exactly what that is.  Sometimes this means recognizing that the problem presented isn’t the actual cause.  Be sure you’re at the root issue.
  • Brainstorm answers to those problems: If the workload is too high, perhaps set up a slower release cycle, or stagger new stories.  The important thing is that these develop as an evolution from defining the problem.  In our case, tossing out alternative story ideas is certainly a different option, but it’s not one that addresses the problem.  The catch here is that good lateral thinking can create neat ideas, something we’re prone to do, just not necessarily ones that are productive for the problem at hand.
  • Simplify: Take some of the potential solutions and see how far you can distill them down and still be successful.  Bells and whistles can come later, for now, keep your eye on the ball.  It will save you time and frustration.
  • Decide how to execute the solution: If you’re changing the release cycle, what will you change it to?  If you change an editorial process, what workflow will be used?  Be clear about defining resources and responsibilities.  If PR needs to write X and marketing needs to review it, get that down.  That way if the process breaks down later, you can refer people back to the original plan.
  • Execute: Set a time to get this ball of goo rolling, and do it as quickly as you can after you’ve decided what to do.  Don’t give ideas time to get stale, or the problem time to magnify.
  • EVALUATE: There’s nothing that says problem solving like being able to show you did it.  This deserves a deeper look.

The whole point in problem solving is to set goals, and measure them.  In the end, it’s all well and good if we choose a couple ways to change our story writing process, but if we don’t have a goal, not only for the problem solving but for the original need, how do we know if what we are doing is working?  In this case, we need a primary goal, one for the centerpiece, and a secondary goal, one for the problem of story production.  Overall, our goal is to showcase the people of the university.  All well and good, but this isn’t really measurable.  What we need to say is that we want to get X% of home page views a month to click through the centerpiece.  We would do this with enticing photography and engaging quotes.  From there we might even say we want an additional X% of those people to go on to visit a related page.  There we have a defined goal, an explanation of how we’ll do it, and a way to measure success.

With the writing problem, we would say our goal is to generate Y new stories a semester, staggering their release every Z weeks, and running them through a new editorial workflow.  The measurement for success would be twofold.  One, are we hitting our production cycle, and two, is our primary goal being met?  If our primary goal isn’t met, we need to sit down again and evaluate whether the problem is still in the production cycle, or if it’s something new, like needing more exciting photos, better placement, more concurrent stories, etc.  Thus the process could start over.  But by measuring our goals, we improve our ability to define problems, determine a course of action, and identify deficits in the future.

Ultimately, your best key to success is a clear problem definition, and somebody to lead your discussion logically to a solution.  We tend to come at a lot of problems with a “throw a bunch of stuff against the wall and see what sticks” mentality.  We’re also very good at going off on tangents or the previously mentioned lateral thinking.  That’s just how tech people are.  We love pretty, shiny objects.  When you get neat or useful ideas, you need to weigh them against what you’re trying to accomplish and perhaps put them off until a better time and place.  While these new, shiny ideas can frequently accomplish some kind of change, that change isn’t always solving the original problem.  The result is something different that just forces you to keep revisiting the same problems.  You just need to stay focused and keep the original problem and goal in sight.  Once those are addressed, you can start doing some of the new things, which will undoubtedly create an entirely new set of problems for you to play with.

How Safe Is Your Job?

There was a point, about three years ago, where I was seriously wondering what job security in a jack-of-all-trades style web job was really like.  Was there value in it?  Was there a future in it?  These kinds of questions are inevitable when you consider where you are, where you can go, and what you can do between here and there.  The web is a big place, and those darn kids that won’t get off our lawn are sharper about web stuff than they used to be.  Are we destined to become a dime a dozen, especially at universities where we are effectively training people to replace us every day?

I think it’s always important to weigh your career once in a while, and determine what its longevity is, both as a job, and as a means to keep yourself satisfied.  That was where a lot of these questions originally came from for me.  I was trying to determine how to further develop myself professionally, and in what ways to do it.  The first thing I was able to really work out was that yes, we should keep ourselves sharp (a given), and that yes, the web is a great industry to be in.

I, for one, don’t view the younger generations as a threat just because they are young and have been exposed to the internet from birth.  Had the web stood still, and never evolved beyond basic HTML, I think it would be a totally different story.  Luckily, the web is crazy dynamic, and evolving in ways that can blow even the most hardened veteran’s mind if they think about it too hard.

Compare it to a Model T.  If automobiles were all still as simple as Model Ts, we’d all be mechanics by now, and we’d never need shops.  That’s how the web in a bubble would be.  But as cars have evolved and become more complex, so too has the web.  We will forever need mechanics, just as we will always need “web developers” (whatever that really means).  And at universities, we’re always short on skilled people and resources.  It might be easy to find a talented youngster to fill a spot, but it’s much harder to keep them when someone else can wave almost twice the money under their nose.

The other side of this point, being a jack-of-all-trades, is equally valuable.  As I mentioned, many of us work in shops where our staffing is painfully low.  It pays then to be good at everything and great at nothing.  I’m not saying that you shouldn’t strive to be an expert in things; it’s just hard to do, and as a human, I require sleep once in a while.  If you network well and know your resources, even the hardest questions can be quickly answered (thank you, Google).  Then as you grow in your position, you become better primed for project management and leadership positions.  The captain of a ship might not be able to fix the engine of his boat, but a good captain would at least be able to talk to the mechanics on their level and know what he can and can’t ask for.  An expert programmer might be able to hammer out Ruby or Java code like nobody’s business, but by specializing in that, they sacrifice other skills.

Ultimately, if you’re happy in what you are doing, and your boss is happy with you, then it’s all good.  But I think we all need a little something to look forward to from time to time.  Where do you think the most valuable skills lie in university web positions?  Project management?  Development?  Design?  Information architecture?  Where are you putting your chips?  And by investing in those skills, do you think doors can continue to open for you on your current ladder, or have you already hit the ceiling?

Getting Synthy with Virtual Tours

Those of you who have kept up with some of Microsoft’s new toys (or who read my Twitter) have undoubtedly heard of a new little Seadragon-based photo interface they have been working on in conjunction with the University of Washington called Photosynth.  This new spatial photo organization system sent the tech world abuzz when news, video, and a tech demo began passing back and forth across sites like Digg and Slashdot.

On August 20th, Microsoft officially took the technology out of the “look but don’t touch” phase, and began enabling users to log in and create their own “synths.”  This amounts to going out, taking a ton of pictures of something, and uploading them.  There’s no other user intervention needed aside from naming the synth and tagging it.  The system calculates groups, intersections, perspective, placement, etc.  Great for saving time, not so great if you want to adjust something.  I noticed in my tests that a couple times it didn’t connect groups that clearly went together, and there’s no way to tell it otherwise yet.

My first instinct is that even though this is a little heavyweight on computing power needs, it’s an awesome idea for virtual tours of campus.  I was recently considering doing a bunch of videos around the university – walking around, looking at things, and other tasks.  The idea is to give potential students a better feel of “being there.”  Instead, I’m considering throwing in some time to build out some of the key areas of campus in Photosynth (along with the videos as well).  Some colleges are already playing with this.  The reason I see this as a huge opportunity is that it moves away from the passive nature of video, and creates an active environment visitors can explore with about as much detail as you make possible.

Furthermore, the ways they are working on improving this technology are simply awesome.  It gets me thinking about how interesting it is that 2D still images could beat out video as an interactive tool on the internet.  In reality, almost everything is more interactive than video, but video is viewed as being more dynamic since it has moving pictures.  Just about the time Photosynth went live, this video came out detailing some of the advances they are already making for the photo tourism software.

Examples

I worked up four quick examples just to test run things, which I have linked just below.  I found that for anything of any size, you should really try to shoot at least fifty photos.  I maxed my synths out at about 120 or so, but could see using way more on other things in the future.  I also found that it works much better if the camera is not the center of the synth; instead, try to make the camera be outside looking in.  In the first example of the Russ Hall stairway, clicking the “Switch to the next 3D Group” button will show you how many different groups it put together, because it seemed to have some problems organizing things in an indoor 3D space.  It also probably didn’t help that I was moving all over trying to cover things from different angles.  Note that in the video I mentioned in the previous paragraph, it looks like they are working on this very issue.  Outside works much more nicely: taking a target and moving around it seems to produce better results.

You can also see how things work from a single vantage point (JungleTron), and how the orbits come together (statue).

What’s Good

  • Embeddable: You can embed the synths in existing web pages, and it has a nice splash prompt asking users to download the plugin if they haven’t (the splash even shows that specific synth’s thumbnail).  U of W’s web site also shows an example using a Java applet, though that doesn’t appear to be a publicly available feature yet.
  • Engaging: Rather than just asking a user to sit and watch something, or flip through a handful of disconnected images, you are encouraging someone to interact with your site and campus.  Consider the potential for Easter Eggs.
  • Easy (what’s with all these E words?): While the ease of use forces you to sacrifice control (see the other E word below), it makes it stupidly easy to deploy and use, which is a great time saver and makes the barrier to entry low.
  • Single Image View: Users don’t have to remain trapped in the interactive world if they don’t want to, and can switch to a simple thumbnail gallery.
  • Storage: Microsoft was generous in giving users 20GB of storage space for synths.  That’s a lot.  For comparison, my four synths are built from 364 1600×1200 images that took up .3GB (that’s point three, or a little over 300MB).

What it Needs

  • A preview mode: This is especially true if you are shooting in hi-res of any kind.  Prior to taking all the time to upload a couple hundred megs of photos, it’d be nice if you could make sure it looks right.
  • Better indoor modeling: Right now Photosynth seems to have some trouble modeling from inside a structure looking out. Recent demonstrations indicate this will be improved soon.
  • Manual stitching: If Photosynth defines a set of photos as a separate 3D group, it’d be nice if you could manually match it to another 3D group that it goes with if it made a mistake.
  • More OS support: Few tears will be spilled over those of us crazy enough to use Linux, but the Mac crowd is big enough that Microsoft needs to get on the ball, or risk Apple et al. doing it better to please their customers.
  • Clean permalinks: Hopefully they will hurry up and do something about the links to individual collections.  At the moment, they contain long synth IDs that are impossible to dictate verbally.  It would be nice to see something more like http://photosynth.net/username/synthname.
  • Editing: Currently, you can’t add or remove photos from a completed synth, and you can’t edit existing photos in place. You also can’t download all the images used as a batch (that I know of).

Is it Ready?

Unfortunately, not just yet, at least not as a primary campus marketing tool.  But, if you have the time and resources to start playing with it, I think there’s plenty of value in it, even at this early stage.  Since the whole process (besides taking the photos) is automated, it doesn’t actually take much to slap a synth together and post it.  Imagine groups like your art department and what they could show off with this kind of functionality.

I think the keys to success with this are all in the “What it Needs” section.  If Microsoft can bull ahead through those issues, particularly making it OS and browser agnostic, I could see this quickly becoming the tour software du jour.  I’ll also be interested in what sites like Flickr will do in response, be it license the software, or produce their own.  There is always the “Microsoft” variable that can and will keep plenty of people from committing to it.  I don’t entirely blame them, as I am frequently against putting all your eggs in one third-party basket.  The only reason I differ on that opinion in this case is because, well, there is no alternative yet, and given what I’ve seen in that latest video, I’m not sure there’s much improvement even necessary.

E-mails, E-mails Everywhere and Not a One to Spam

It was brought to my attention the other day that there are some concerns about e-mail addresses published on our college’s web site and the effect they have on spam.  It turns out the filters here run through about 10,000,000 e-mails a day, only about 7% of which are passed on as actual, legitimate messages.  We are not a huge campus, but I’m going to guess that many of you would see a similar ratio.  Naturally, this has brought up conversation about obfuscating e-mail addresses.  We’ll set aside the “closing the gate after the horse got out” metaphor for now, because these techniques can still prevent spam from hitting new addresses, so at the very least we can lighten the load for our new users.

Of course, the ultimate e-mail obfuscation problem is how to do it accessibly.  By its very nature, if you are making information accessible to those without JavaScript, or with screen readers, etc., then you are publishing data in a fashion that can be picked up by spammers.  There are plenty of methods that work great, and I’d be happy to use them on my personal blog or such, but they simply aren’t feasible for a college trying to be 508 compliant (or otherwise, depending on whether your state has its own guidelines as well).  If you have done any research on this topic, you have undoubtedly come across A List Apart’s article on the subject.  In a lot of ways, the conversation can start and end there, because they’ve broken the issue down to the atomic level, and reconstructed it as gold.  But there are some other methods, and some other considerations I want to point out, especially because I’m in a non-PHP environment now, so I had to go another route to find a solution.

CSS (Code Direction)

This technique came to me by way of Silvan Mühlemann’s blog.  Of any method, I think this is both the easiest and coolest, and it works in Firefox and IE6.  The problem is, it’s also the worst.  It relies on the idea that you can take a string and, with a CSS attribute, reverse the flow of the information inside the selector to make it readable.  So, when you write the address, you say moc.elpmaxe@liame, and with CSS reverse it to read email@example.com.  The reason this is bad is twofold.  First, you can’t make it clickable.  The CSS only works on content within the selector, so you can’t manipulate an href, and obviously putting the e-mail in as a plain href is as bad as having it normal in the page in the first place.  Secondly, it breaks copy and paste, because copying the text copies from the source, which is backwards.  So pasting it produces the original moc.elpmaxe@liame.  If you make the address unclickable, you darn sure better not break copying too.  The frustrating part is that Mühlemann’s blog reported a 0% spam rate over a year and a half on an address encoded in this manner, so it appears to be great at stopping spam, however unusable it is.
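For the curious, here is a minimal sketch of the trick as I understand it from Mühlemann’s post (the class name is my own invention):

    <style type="text/css">
      /* Display the characters right-to-left; the source stays reversed. */
      span.backwards {
        unicode-bidi: bidi-override;
        direction: rtl;
      }
    </style>

    <!-- Renders as email@example.com, but the source reads backwards. -->
    <span class="backwards">moc.elpmaxe@liame</span>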

CSS (display:none)

This faces pretty much all the same problems as the other CSS technique, but instead relies on using a span inside an email address to hide a part and make it human readable: email@<span style="display:none;">I hate spam</span>example.com.  A user can read the address without issue, but still can’t copy it, and you still can’t make it a link.

Character Entity Encoding

This is the practice of taking an e-mail address and encoding all the characters into HTML entity values, so email@example.com becomes &#101;&#109;&#097;&#105;&#108;&#064;&#101;&#120;&#097;&#109;&#112;&#108;&#101;&#046;&#099;&#111;&#109;.  This is better than having an e-mail in plain text (producing a 62% decrease in spam volume over plain text), and it allows you to make the address clickable.  However, it’s straightforward enough that it comes in second behind plain text as the easiest to get past, even if the decrease in spam volume was fairly significant.

A similar, alternative method that appears to reduce spam load by 92% over plain text is to mix in entities for just the “@” and “.”, producing a mailto like email&#64;example&#46;com.  This is probably because crawlers are set to ignore single occurrences of encoded entities, and with those two characters encoded, the string no longer matches an e-mail pattern (at least until the crawlers get smart enough to match this pattern too).  A sketch of both variants follows below.
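Since the encoding happens before the page ever reaches a browser, a couple of build-time helpers are all you need.  A quick sketch in JavaScript (the function names are mine, and the address is a placeholder):

    // Run these while generating the page, not in the visitor's browser.
    // Full encoding: every character becomes a numeric HTML entity.
    function encodeAll(address) {
      var out = '';
      for (var i = 0; i < address.length; i++) {
        out += '&#' + address.charCodeAt(i) + ';';
      }
      return out;
    }

    // Selective encoding: only the "@" and "." are replaced.
    function encodeSeparators(address) {
      return address.replace(/@/g, '&#64;').replace(/\./g, '&#46;');
    }

    // encodeAll('email@example.com')        -> '&#101;&#109;&#97;...'
    // encodeSeparators('email@example.com') -> 'email&#64;example&#46;com'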

Both of these methods can be considered viable for accessibility purposes, and they make a big enough impact that one could seriously consider employing them full time.

Insert Comments

Inserting comments results in addresses like email<!-- -->@<!-- @ -->example<!-- -->.<!-- . -->com.  This, however, fails the test of making the address clickable.  It is more effective than fully character encoding the address, but less so than selectively encoding the “@” and “.”: it received about 444% more spam than that method, and decreases spam by only 11% over full entity encoding of the address.

Build with JavaScript

The process of using JavaScript to concatenate the components of an e-mail string is almost foolproof in its ability to trick spiders.  It relies on setting a couple variables and combining them all in a fashion similar to document.write("<a href=" + "mail" + "to:" + string1 + string2 + string3 + ">" + string4 + "</a>");.  But naturally this is a problem for those not using JavaScript: they would simply get no output where this is used.  In other words, it doesn’t degrade gracefully.
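Fleshed out, the approach looks something like this (the address is a placeholder, and splitting the strings is the whole trick: the full address never appears in one piece in the source):

    <script type="text/javascript">
      var user = 'email';
      var host = 'example' + '.' + 'com';
      // Even "mailto:" is split so the href never looks like an e-mail link.
      document.write('<a href="mail' + 'to:' + user + '@' + host + '">'
                     + user + '@' + host + '</a>');
    </script>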

Use “name AT site DOT com”

If you look around on blogs and forums, there is a growing trend to type out an e-mail address in the fashion of “username AT website DOT com,” or some variation thereof.  First, this doesn’t address clickability, and second, it’s not really a trick.  All spammers have to do is Google a phrase like “AT gmail DOT com” (I got 10.3 million hits) and start saving matches.  Oddly enough, this appears to produce less spam than building with JavaScript, but the click problem combined with almost inevitable circumvention makes it pretty useless to us.  And personally, I’m not a fan of making a visitor do extra work to fix a deliberately tweaked address if it is at all avoidable.

ROT-13 Encoding

ROT-13 is a basic substitution cipher that rotates each character 13 places, which allows text to be encoded and decoded very easily.  Using it to process e-mail addresses appears to be one of the foolproof means of avoiding spam crawlers (along with the CSS techniques).  Here’s a basic tool that you can test the technique on.  PHP readily includes the str_rot13() function that can be used for this.  But one last time, you’re limited to people using JavaScript.
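If you want to skip PHP entirely, the decoder is only a few lines of JavaScript.  A minimal sketch, using a placeholder address that is stored pre-encoded in the page:

    // ROT-13: rotate letters 13 places; other characters pass through.
    function rot13(s) {
      return s.replace(/[a-zA-Z]/g, function (c) {
        var base = (c <= 'Z') ? 65 : 97;
        return String.fromCharCode((c.charCodeAt(0) - base + 13) % 26 + base);
      });
    }

    // 'rznvy@rknzcyr.pbz' decodes to 'email@example.com'
    document.write('<a href="mailto:' + rot13('rznvy@rknzcyr.pbz') + '">'
                   + rot13('rznvy@rknzcyr.pbz') + '</a>');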

A List Apart Method

Rather than explain this, go read their tutorial.  It’s very clever, and is probably the best alternative out there, but only if you are using PHP and can write some custom .htaccess URI rewrite rules.

So, given this boatload of information, where does it leave us?  I think many of us in educational circles can use A List Apart’s system for any of our e-mails that show up in dynamically generated listings.  E-mail addresses added to a page by an editor or such would have to be handled manually, though (you can get around this with some additional work using Apache’s mod_substitute).  My solution is a combination of techniques.  Our CMS is Java based, so A List Apart’s methodology doesn’t exactly work.  But what I can do is combine ROT-13 encoding with a <noscript> alternative that incorporates an image generator and a character-encoded link to keep it clickable.  This would create an image representation of the address that is properly alt tagged so that screen readers can still interpret the address and users can still click it.  I think this is a good blend in my case; a sketch follows below.  There is a URIRewrite application on our server as well that would allow me to do some of the A List Apart system in the future.  The point being, you don’t have to use only one solution; you can combine different options to try to get the best of every world.  But there is no magic bullet if you are trying not to break accessibility.
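To make that blend concrete, here is roughly the shape of it; the image generator URL is hypothetical, and rot13() is the decoder sketched earlier:

    <script type="text/javascript">
      // JavaScript users get a normal, clickable mailto link.
      document.write('<a href="mailto:' + rot13('rznvy@rknzcyr.pbz') + '">'
                     + rot13('rznvy@rknzcyr.pbz') + '</a>');
    </script>
    <noscript>
      <!-- Everyone else gets an entity-encoded mailto wrapped around a
           generated image of the address; the alt text keeps it readable
           for screen readers. -->
      <a href="mailto:&#101;&#109;&#97;&#105;&#108;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;">
        <img src="/tools/address-image?id=1234" alt="email at example dot com" />
      </a>
    </noscript>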

For many of us, the horse may already be out of the gates, so closing the gate now might not do much.  But we can at least try to ease the load on new addresses that become published, and make the spammer’s job harder (and make email admins less likely to gripe at you).  There’s no good excuse for handing over emails as plaintext when we have tools to easily avoid it.  And ultimately, if a human can read it, it’s inevitable that spammers will crack through it.  For the time being, that process isn’t cost effective for them though, so we might as well take advantage of it.

Is Hosted Search Really Ready for Prime Time?

In the years I’ve now spent in higher education, one universal truth I have found is that nothing quite moves a project along like someone much more important and much less web savvy than you deeming an issue worth addressing.  Such was the case only a couple months after I had started at the university, when the Director of Marketing noticed that new information she had put up on the site wasn’t coming up in search results, and the results that were hitting weren’t particularly relevant to the topic in the first place.  Thus, a mission was born: find a way to make our search better, and do it NOW.  That’s the other thing about people higher up than you: when they say jump, generally you jump.

At the time (approximately three years ago), we had been using the pretty straightforward Google search for web sites.  It amounted to putting a box on your page that submitted to Google, restricting results to your domain.  You couldn’t really do anything else with it then besides add a banner to the top.  So began the odyssey.  Most of the major players offered a basic site search back then, all of which were fairly equally crippled.  The Google Search Appliance was (and still is) crazy expensive and totally overkill for our site.  The IBM/Yahoo product OmniFind was still a few months from launch (nor did we have hardware to run it on at the time).  The Thunderstone Parametric Search Appliance just looked a little… well, no one I know had ever heard of them, and their website wasn’t (and still isn’t) something that inspires my confidence.  The Google Mini, on the other hand, was cheap, more than adequate for our site size, and was getting good reviews.  Not to mention that the money to get it was ready, willing, and able.  All that made the choice pretty easy for us, so we dove in.

Now, fast forward a couple years.  We are still using our Mini.  In fact, I just upgraded to 5.0.4 on Monday.  I’ve never had a lick of trouble with it and became a pretty quick fan.  This year at eduWeb I had the good fortune to share my experience with a couple people, and the conversation generally drifted towards: “Why is that better than Google Site Search?”  Originally, the Mini offered a ton of unique features, such as custom collections, theming, and the ability to export results as XML.  The past year has seen a growth in the availability and features of free, hosted search solutions.  Yahoo BOSS looks to be an API that wants to take a serious swing at the hosted search crown.  Google’s Custom Search Business Edition (CSBE), AKA Google Site Search, is also offering businesses and schools search with many of the features of the Mini, like the ability to remove branding and ads and to call results as XML (note: Google Site Search is free for universities).

With all these new options, is the Mini even a worthwhile investment now?  We’re coming up on the end of our support term, so I figured this was a prime time to evaluate the field.  My short answer is: Yes, it still is.  My long answer also happens to be yes.  See, search is important.  Search is doubly important for universities because we have so much crap out there, and so many different topics to address (many of which also happen to be crap, but you can’t tell that to the people putting it out there).  A Mini now costs $3,000 with two years of support, which is equal to six years of equivalent CSBE service (assuming you had to pay), which prices out at $500 a year for 50,000 pages.  Obviously Google isn’t trying to mothball its own products, so where does the Mini make up that cost?

First, I think there’s huge value in crawling.  Remember our original problem?  Content was not making it into the search results fast enough.  With the Mini I can schedule crawls, or just set it on continuous mode and let it go nuts.  Using nightly scheduled crawls, I ensure that any content added to the web site shows up in search within 24 hours, and usually faster than that (unless some crazy person is up adding content to the site at 12:01 AM).  Going through Webmaster Tools, I can only tell Google to crawl our site at a Normal or Slower rate.  We don’t even rate high enough to get the Faster crawl rate option.  So users of Site Search are pretty well cornered on the matter.  Once I crawl our site with the Mini, I can have it output a sitemap that I feed to Google’s spider to help with their indexing as well, so the benefit becomes twofold.
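The sitemap format Google’s spider expects is just the standard sitemaps.org protocol.  For reference, a minimal example with a placeholder URL and date:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.edu/admissions/</loc>
        <lastmod>2008-08-25</lastmod>
      </url>
    </urlset>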

Next up, raise your hand if you have an intranet, or otherwise secured information not available to the public.  All of you can pretty well scratch CSBE/SiteSearch off your short list if you’re looking for a way to dig through it.  If you want to index any kind of protected content, you’ll have to go with an actual hardware solution, as both the Mini and GSA support mechanisms to crawl and serve content that is behind a security layer.  This is a great option if you buy a Mini, use up the initial two years of support, then buy a second one: use one for internet and the other for intranet.

You’re also going to find that you are capable of pulling more valuable metrics out of the Mini than what you get with CSBE/Site Search.  Granted, the standard “what are people searching for” question is easily enough answered.  But what about “what are people searching for that isn’t returning results?”  That can be equally valuable in a lot of cases.  And while Site Search allows for search numbers by month and day, the Mini can go down to the hour, as well as show you your current queries per minute.  It’ll even keep tabs on how many pages it’s currently crawling and how many errors it found, and e-mail you about it all.  All the reports can be saved out as XML, naturally, so you can mix and match datasets as you need for custom reports.

And I have one word for you: OneBox.  The Mini has it, thanks to a trickle-down effect from the GSA – hosted Google options do not.  The OneBox essentially allows you to add in custom search results based on query syntax, and tailor the styling of the results.  You see this all the time at Google, for instance when you type in a phone number or a FedEx tracking number.  These results need not come from your Google Mini search index; they can come from other collections, or other sources entirely.  In the screenshot to the right, you can see a mockup of a OneBox result that matches a name format and returns contact information along with the standard search results.  Uses for this are many, and can span anything you might store in databases: course listings, book ISBNs, names, weather (if you have campuses in different cities), room information, etc.  Anything that you can define some kind of search pattern for.

On a quasi-similar note, you can also link certain searches (or parts of searches) to keymatches.  These are commonly used for the ads on Google that appear at the very top of search results (usually highlighted light yellow with the “Sponsored Link” caption), but you can use them to highlight a link that goes right to the automotive department when someone searches for something containing the word “auto.”  This is another feature unique to the Mini and GSA, and one more way to make sure searchers are presented with relevant links.  It is very useful in cases where a department’s site isn’t well optimized and doesn’t show up first in a search for its own name.

Ultimately, it’s a judgment call whether or not these features are worth the money to you.  At $3000, you’re basically paying $1000 each for the server itself and two years of support.  You can’t buy the unit without support though, so that notwithstanding, you’re getting a full featured search box with support for about twice the cost of a good PC.  If you have more than 50,000 pages to index though, you’ll find that price goes up.  At the same time, if you do have over 50,000 pages, there are a lot of other reasons not to go hosted, such as control over results, index freshness, result relevance, etc.  All these are always important, but they become even more so the bigger your site is.  Consider, if you have half a million pages on your site, and you need to make sure people find the needle that they need to in that haystack, would you rather have some control over that, or cross your fingers and hope Google gets it right?

My end impression is that Google’s Site Search is a great little tool for small businesses that are dealing in a few thousand pages, who can’t afford a server, or who don’t have the resources to maintain one.  Keeping up the server isn’t an involved job at all, but it does require someone capable of checking in on it monthly or so, at least.  But as universities, we generally have the resources for such a tool, both financially and manpower-wise.  We’re also large enough to justify a dedicated box for such an important task.

If you’re still researching what’s right for you in hosted search, it might well be worth keeping an eye on Yahoo BOSS, though; it’s making some pretty cool claims on functionality.  OmniFind is also great free software if you already have the resources to run it in place (like a VMWare cluster or other virtualized environment) and can function within its limitations (the cap of five collections being the big one).  Just remember, search is possibly the single biggest tool on your website behind maybe your portal, and it deserves the time and attention needed to give your users the experience they expect and deserve.