Fixing GNU Mailman to handle mimetypes

I host a few neighborhood email lists on my Linux server running the excellent GNU Mailman list server software. Part of my setup involves stripping pictures/documents from emails and storing them in the list archives instead. This way 300 neighbors don’t get a 5 MB attachment emailed out to them: if anyone wants to view the picture/document all they have to do is click on a link in the original email and it will be fetched from the archives.

Tonight I noticed that the MIME type image/pjpeg wasn’t being properly parsed by Mailman’s Scrubber.py script. Having dealt with MIME type problems before, I suspected that the problem wasn’t with Mailman itself but with the operating system’s definition of the MIME type.

Sure enough, checking the /etc/mime.types file revealed there was no image/pjpeg type defined. A little more Internet hunting brought me to this post on the Mailman list, confirming the missing mime.types entry as the culprit:

On Jan 6, 2010, at 8:18 AM, Ralf Hildebrandt wrote:

> * Ralf Hildebrandt :
>> I have a list where the attachments are removed and stored on the
>> mailman server itself.
>>
>> This works like a charm, but SOME image attachments of the type:
>>
>> image/pjpeg
>>
>> are stored as “attachment.bin” instead of “attachment.jpg”
>>
>> Why?
>> Example below:
>
> adding “image/pjpeg” to /etc/mime.types fixed that:
>
> image/jpeg jpeg jpg jpe
> image/pjpeg jpeg jpg jpe

This is because Mailman uses Python’s mimetypes module to generate the file
name, and I believe that consults /etc/mime.types where available. Since
before your edit Python didn’t know anything about image/pjpeg, it assumed it was
random binary data, hence the .bin suffix.

-Barry

From what I can find out, image/pjpeg is a type that Microsoft products choose to use instead of the image/jpeg that the rest of the world uses. I guess those crazy Redmonders are just trying to keep us on our toes, eh?
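Barry’s explanation is easy to try out with Python’s mimetypes module. What follows is only a rough sketch of the idea, not Mailman’s actual code: the attachment_suffix helper and the private registry are names I made up for illustration, and the ".bin" fallback simply mirrors the behavior described in the thread.

```python
import mimetypes


def attachment_suffix(content_type, registry):
    """Return a file suffix for content_type, or '.bin' if the type is unknown.

    Hypothetical helper: it imitates the fallback described in the thread,
    it is not copied from Mailman's Scrubber.py.
    """
    return registry.guess_extension(content_type, strict=False) or ".bin"


# A private registry, so the experiment doesn't touch the global tables.
registry = mimetypes.MimeTypes()

# Same effect as adding an "image/pjpeg" line to /etc/mime.types,
# but only for this one registry object.
registry.add_type("image/pjpeg", ".jpg")

print(attachment_suffix("image/pjpeg", registry))
print(attachment_suffix("application/x-no-such-type-example", registry))
```

A type the registry has never heard of falls through to .bin, which matches the “attachment.bin” symptom Ralf reported. Editing /etc/mime.types remains the system-wide fix; add_type only affects the running Python process.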

Posting boasting

I have to give a shout out to my blogging friend Chris O’Donnell on his 5,000th post. That’s a lot of blogging, and quite an accomplishment.

Personally, this marks my 4,553rd published post. I expect to break 5,000 within a year, though I have to say my increasing use of Twitter and Facebook has slowed my pace a bit. A quick update to those microblogs is like a quick scratch of an itch, whereas a blog post is like a full-blown bath in calamine lotion!

WordPress has Facebook-like link excerpting

Remember when I wished I had Facebook-like link excerpting in WordPress? It turns out I already do: it’s a bookmarklet built into WordPress called Press This.

Here’s how to use it:

In your WordPress Dashboard’s menu, click Tools. Drag the Press This link at the bottom of that page to your browser’s toolbar.

Now, when viewing a webpage that you’d like to add to your blog, simply highlight whatever text you’d like to include in your blog post and click on the Press This bookmarklet you just created. A new window will open up with your selected text already added to the editor and the title of the post set to the title of the webpage you were viewing. You can then adjust the text accordingly (add comments, etc.), and then click Publish. Super easy!

A big hat-tip to Scott Reston for pointing out this nifty feature!

Google background images irritate some

Google opted today to splash some color on its trusty, rusty search page using background images. Not everyone is hip to the change, which some say is a response to Microsoft’s BING search engine.

I’d be okay with the change as long as it didn’t slow down the loading of my Google page and I had the option to turn it off. While this could have been a welcome change, Google screwed up when it didn’t give users the ability to disable it.

Attention, Google: I use your search engine for the results it provides me, not because it’s pretty (or not pretty, as the case may be). Give your users the option to turn off the BING bling and everything will be cool.

Update 3 PM: Google listened, and now users can get the old-fashioned page back again. Thanks, Goog!

Adding Facebook’s link excerpt functionality to WordPress

One of the things that makes me more prone to update Facebook rather than my blog is the ease that Facebook’s user interface provides for quickly adding a link and a comment to that link. I click on the “link” box and my browser automatically loads an excerpt from the link’s page, including thumbnails from that page. I can add my commentary on the link in a few seconds and publish it to my Facebook wall.

Does anyone know if there is a WordPress plugin that implements link excerpting the way Facebook does? I often see interesting webpages and simply want to quickly share them without hassling with a WordPress editor to do so.

I’ve found plenty of plugins that implement some Facebook-ish functionality but nothing that does this exact thing. Evermore comes closest, but it excerpts one’s own blog posts, rather than something to which one is linking.

Let me know if anyone finds anything.

Update 21 June 2010: WordPress has this built in and I just now found it out. Hurray!

The ghosts of Children’s House

I’ve been checking the webserver logfiles here on MT.Net and note that a number of Google searches have brought people here looking for information on the Children’s House of Raleigh (CHR). Every time I discover someone else searching for that now-defunct school it makes me sad. Like the other kids there, our daughter got a great education at CHR. I felt a real kinship with the staff and other parents. Then the wheels came off. I’m not really sure what happened, but for whatever reason it just didn’t work out.

It’s tough to see something you poured love and work into come to an inglorious end.

Upping the spambot ante

This morning I was surprised to see that a spammer had apparently breached my WordPress anti-spambot gauntlet. What does this mean in English, you ask? A potential hacker actually succeeded in registering an account on MT.Net, from which he could potentially attack my website.

At first I thought a bot had solved my CAPTCHA challenge, but after looking at the log entries it does not appear that this was an automated attack. Some dumb schmuck actually typed in the code by hand. That’s what most visitors to my website do, but most people don’t do it using email and IP addresses associated with hackers.

I’ve since turned on SABRE’s RBL lookup tests. This will automatically check the incoming IP against a list of suspect addresses. If there’s a match, the rogue visitor gets automatically booted before he even begins.

It’s not perfect security, but it’s one of the many defenses needed to protect a website.
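For anyone wondering how an RBL lookup works under the hood, here is a minimal sketch of the usual DNS blocklist convention: reverse the octets of the visitor’s IPv4 address, append the blocklist’s zone, and see whether that name resolves. This is not SABRE’s code. The function names are mine, and dnsbl.example.org is a placeholder zone rather than a real blocklist.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the blocklist zone has an entry for this address.

    By convention a listed address resolves (typically to 127.0.0.x)
    and an unlisted one fails to resolve at all.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# Placeholder zone and a documentation-range address, for illustration only.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Only dnsbl_query_name is exercised here, since is_listed needs a live DNS query against a real blocklist zone.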

MSN can’t take no for an answer

Earlier this week I banned MSN’s msnbot from spidering my website. I did this with an entry in the robots.txt file:

User-Agent: msnbot
Disallow: /

I checked with MSN’s robots.txt verifier to make sure this would keep msnbot from spidering my site. The only problem was that I had also blocked the MSN IP addresses. Thus msnbot couldn’t fetch robots.txt to learn it was no longer wanted.
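The same two-line rule can also be checked locally with Python’s standard urllib.robotparser module. This is just a sketch to show how the rule is interpreted, and the allowed helper is a name I made up. It says nothing about how msnbot itself behaves, and it only works because the rules are parsed from a string. A real crawler has to fetch robots.txt over the network first, which is exactly what my IP block prevented.

```python
from urllib.robotparser import RobotFileParser

# The same two lines from my robots.txt.
ROBOTS_TXT = """\
User-Agent: msnbot
Disallow: /
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(allowed("msnbot", "/"))
print(allowed("Googlebot", "/"))
```

With this rule msnbot is refused everywhere, while any crawler not named in the file is still allowed.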

So, I unblocked the IPs and allowed msnbot to grab the robots.txt file, which it did repeatedly (this is a small sample):