2012-08-17


Andreas 'ads' Scherbaum

Last night a long-running project of mine went live: pg_docbot v2.

For years, Jan Wieck provided a helper bot (rtfm_please) in the #postgresql IRC channel on the freenode network. Because of protocol changes in the freenode network, this bot stopped working. Together with some others, we decided to write a quick and dirty new bot. As it is with dirty hacks, not everything was optimal: after timeouts the bot was not able to reconnect; more precisely, the POE framework did not even recognize the timeout. Extending the bot and adding new functionality was also complicated. For a while I collected all these problems in my personal bug tracker, and about two years ago I started a full rewrite.

Some of the new key features:

pg_docbot's channel limit is gone: a user in the freenode network can only join 20-something channels, so the new bot was designed from the ground up to handle multiple IRC connections and work around this limit

function to identify stale URLs: the new ?lost command shows all unconnected URLs

registered users are now either "op" or "admin": all operators can issue ?learn and ?forget, admins can - of course - do everything

new command to post to all channels: the ?wallchan command lets the bot post to all channels

i18n: every channel has a configured language, default is English - all messages in this channel are posted in the configured language (if a translation is available)

watchdog on board: every session is monitored and reconnected, if necessary - no more "ads: can you please restart the bot?"

nickname handling: every session monitors its (registered) nickname and will reclaim the nick if necessary; NickServ handling is included now as well

commands are recognized in different languages: a nice add-on, by-product of i18n, most commands can be used in different languages - like "search" (English) and "suche" (German)

bot can join and leave channels on the fly: not much to say about this, just that you can have the bot in a temporary PostgreSQL channel if you like

channels can have passwords now: this works for configured channels as well as for channels joined on the fly

autojoin channels: configured but not joined channels are rejoined after a while, also it is possible to configure but not autojoin channels

statistics: the bot collects anonymous statistics about its usage, such as ?search, ?learn, ?forget and so on
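To illustrate how commands in different languages can map to the same action, here is a minimal sketch. It is not taken from the pg_docbot code: Python is used purely for illustration, and the alias table, the function name `parse_command` and the returned tuple are all assumptions. Only the "?" prefix and the search/suche pair come from the description above.

```python
# Hypothetical alias table: localized command word -> canonical command.
# Only "search" and "suche" are mentioned in the post; the rest is illustrative.
COMMAND_ALIASES = {
    "search": "search",
    "suche": "search",
    "learn": "learn",
    "forget": "forget",
}


def parse_command(message):
    """Return (canonical_command, arguments), or None if message is no known command."""
    if not message.startswith("?"):
        return None
    parts = message[1:].split(None, 1)
    if not parts:
        return None  # a lone "?" is not a command
    canonical = COMMAND_ALIASES.get(parts[0].lower())
    if canonical is None:
        return None  # unknown command word
    arguments = parts[1].strip() if len(parts) > 1 else ""
    return canonical, arguments
```

With a table like this, "?suche vacuum" and "?search vacuum" both resolve to the same canonical command, so the rest of the bot only has to handle one name per command.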

There is still a lot to do, not all of my tickets are closed. If you want pg_docbot talking in your language, please send me translations. The pg_docbot code is on git.postgresql.org.

Next things on my todo list:

verify each URL from time to time: mark unreachable as invalid

intelligent sort order: not yet sure how to solve this problem, right now there is no specific sort order

move pg_docbot to PostgreSQL infrastructure

web interface: the bot should redirect the user to its website if there are more than, let's say, 2 or 3 URLs, to avoid flooding the IRC channels

integration in postgresql.org website: the pg_docbot database contains very useful knowledge, there are plans to integrate this into the search on the main website

integration with explain.depesz.com: every time the bot sees a link from a paste site, it should scan the content and generate a posting on explain.depesz.com

monitor planet.postgresql.org: publish new postings in IRC channels

allow better search: like using a regexp

...
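The first item on this list, verifying URLs from time to time, could look roughly like the following. This is only a sketch under assumptions, not the planned implementation: the function names, the HEAD request, the timeout and the rule "unreachable or status 400 and above counts as invalid" are all made up for illustration, and Python is not meant to suggest anything about the bot's own code.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, malformed URL, ...


def find_invalid_urls(urls, fetch=http_status):
    """Return the URLs that are unreachable or answer with status >= 400 (assumed rule)."""
    invalid = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            invalid.append(url)
    return invalid
```

The `fetch` parameter makes it possible to try the logic without network access, for example by passing a dictionary lookup instead of a real HTTP request.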

Continue reading "New PostgreSQL pg_docbot is live"