Wednesday, July 3, 2002

Not as cool as Google, but updated more often.


I've set up a local search engine to index all of MonkeySpeak, based on ht://dig. I navigate to older articles often enough that a search feature is actually useful for me. Your mileage may vary.



ht://dig is pretty cool, though the way it pluralizes (and adds other suffixes to) search terms is a bit inconsistent. For example, if you search for "thing", it will match against "thing", "things", and even "thingness" (for some reason), but if you search for "lego", it won't match against "legos". I'll look into that when I get a chance.



2 comments:

  1. Does it search for "legoes" instead? Maybe it's just too clever by half, as it were.

  2. > Does it search for "legoes" instead?
    No, whenever it considers suffixed variants, it tells you. (Search for "thing", and look at the very top of the result page.) Turns out ht://dig has simple word formation rules for prefixes and suffixes (based on those in ispell). It also has a synonym database. (Search for "chili", and see what else it searched for.) It uses these (and other tricks) to try to make searches more useful.
    Their heuristics don't always do what I want, but after looking into it, it's not at all obvious how to improve upon them.

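
For the curious, here's a rough sketch of how ispell-style suffix rules can produce the "thing" vs. "lego" behavior described above. This is a toy illustration, not ht://dig's actual code: the rule table, the flag letters, the dictionary entries, and the `expand` function are all made up for the example. The assumption is that each dictionary word carries flags saying which suffix rules apply to it, so a word missing from the dictionary gets no variants at all. Whether that is really why "lego" comes up empty is a guess.

```python
# Hypothetical ispell-style suffix expansion; NOT ht://dig's actual code.

# Each flag maps to a list of rules: (ending to strip, ending to add, condition on the word).
SUFFIX_RULES = {
    "S": [("", "s", lambda w: not w.endswith(("s", "x", "z", "y")))],
    "P": [("", "ness", lambda w: not w.endswith("y"))],
}

# Toy dictionary (made-up entries): a word only gets the suffixes its flags allow.
DICTIONARY = {
    "thing": "SP",
    "chili": "S",
}


def expand(word):
    """Return the word plus every suffixed variant its dictionary flags allow."""
    variants = [word]
    # Words that are not in the dictionary have no flags, so nothing is added.
    for flag in DICTIONARY.get(word, ""):
        for strip, add, applies in SUFFIX_RULES.get(flag, []):
            if applies(word) and word.endswith(strip):
                stem = word[: len(word) - len(strip)]
                variants.append(stem + add)
    return variants


print(expand("thing"))  # ['thing', 'things', 'thingness']
print(expand("lego"))   # ['lego']
```

Under these assumptions, the fix for a word like "lego" would be a dictionary entry rather than a new rule, which is part of why the heuristics are hard to improve in general.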