Trace your referrers in real-time

One of the magic things about Dreamhost is the ability to log in to your account using SSH. Apart from the standard things you can do with it, like compiling and installing any program you like (excluding those that require root access, of course), you can also experience a little bit of magic if you run this command:

tail -f current-httpd-accesslog

There is really something special about watching your website’s visitors as they arrive. And, of course, the most interesting part is knowing where they come from (their referrers). The first problem with this command is that it prints a lot of noise on the screen (timestamps, response codes, sizes, etc.). The second problem is that piping the real-time tailed log through grep and further filters did not work for me. Why exactly? Don’t ask me, I’m not a Linux guru, though output buffering somewhere in the pipeline is the likely culprit. You can, however, write a script that deals with it in its own way. And this is what I’ve been writing for the whole day. It was both fun and painful to learn, for the n-th time, all those shell hacks and quirks. Was it worth it? Sure it was! As a result I came up with this little bash script to trace your referrers in real-time. You can download it or view it below. I must warn you, though. It is highly addictive. Really…

#!/bin/bash

# ========================================================================
# REAL-TIME WEBSITE REFERRERS TRACER
# ========================================================================
#
# What?
#   Real-time website referrers tracer is a shell script that lets you
#   trace your visitors as they come. The script displays an ultra-compact
#   four-column view :)
#
# Why?
#   Because 'tail -f access_log | grep something', piped into further
#   filters, tends to stall (most likely on buffered output), and you
#   really want to grep out most of the stuff that your httpd puts in
#   the logs.
#
# Requirements:
#   - website (with not too low and not too high traffic),
#   - shell account on the server where your website is hosted,
#   - access to httpd logs that use the COMBINED format.
#
# Installation:
#   - copy anywhere in your home directory,
#   - edit the script and set the 'log' variable so it actually
#     points to your current httpd log,
#   - make sure the script has execute rights (chmod +x trace-referers).
#
# Running:
#   - just run the script and watch the screen.
#
#
# Version: 0.2 (2005-06-18)
# Author: Paweł Gościcki, http://pawelgoscicki.com
#
# No copyright rights. You can do whatever you want with this. You may even
# claim this script has been written by you from the very beginning ;)
#
# If you, however, improve it, send me a copy (paul_AT_pawelgoscicki.com).
#
# Based heavily on the tgrep script by Ed Morton (morton_at_lsupcaemnt.com):
# http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2004-01/0818.html


# CONFIGURATION
# =============

# Where your httpd log file is
log="current-httpd-accesslog"

# What files to exclude (requests for those files won't be shown, regexp syntax)
exclude="\.gif|\.jpg|\.png|\.ico|\.css|\.js"

# Width of request and referrer columns (set it to match your terminal's width)
col_width=35


# MAIN SCRIPT
# ===========

# Check if log file actually exists (and is readable)
if [ ! -r "${log}" ]; then
  # report the problem on stderr and signal failure to the caller
  echo "Cannot access log file: ${log}" >&2
  exit 1
fi

# After startup, show the last 30 lines of the log
start=$(wc -l < "${log}")
start=$(( start - 30 ))
if (( start < 0 ))
then start=0
fi

# Main loop
while :
do
  end=$(wc -l < "${log}")
  end="${end##* }"
  if (( ${end} > ${start} ))
  then
    start=$(( $start + 1 ))
    sed -n "${start},${end}p" "${log}" | grep -E -v "${exclude}" | \
    awk -v col_width="$col_width" '{

      # we are only interested in GET/POST requests
      if ( match($0, /"(GET|POST) [^"]*"/) > 0 )
      {
        split($0, fields, "\"")

        # IP_ADDRESS
        tmp = $1
        while ( length(tmp) < 15 ) tmp = tmp " "
        printf "%s", tmp " "
    
        # HTTP_REQUEST (GET/POST)
        tmp = substr(fields[2], 1, index(fields[2], " HTTP/") - 1 )
        tmp = substr(tmp, index(tmp, " ") + 1, col_width)
        while ( length(tmp) < col_width ) tmp = tmp " "
        printf "%s", tmp " "
    
        # REFERER (the juice)
        tmp = fields[4]
        while ( length(tmp) < col_width ) tmp = tmp " "
        printf "%s", tmp " "
    
        # USER_AGENT
        printf "%s", fields[6]
    
        # new line at the end
        printf "\n"
      }
    }'

    start=${end}
  fi

  # this is an endless loop executed every second
  sleep 1
done
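
The polling loop above is not the only way to get there. Here is a minimal sketch of a streaming alternative, assuming GNU grep (for its --line-buffered option) and gawk (for fflush()). The function name format_referrers is made up for this example and is not part of the script above. Unlike the script, it also truncates the referrer column so that the columns stay aligned.

```shell
# format_referrers [width]
# Hypothetical helper: reads COMBINED-format log lines on stdin and prints
# IP, request path, referrer and user agent in fixed-width columns.
format_referrers() {
  awk -v w="${1:-35}" -F'"' '
    BEGIN { fmt = "%-15s %-" w "s %-" w "s %s\n" }
    # with a double quote as the field separator:
    #   $2 is the request line, $4 the referrer, $6 the user agent
    $2 ~ /^(GET|POST) / {
      split($1, head, " ")   # head[1] is the client IP
      split($2, req, " ")    # req[2] is the requested path
      printf fmt, head[1], substr(req[2], 1, w), substr($4, 1, w), $6
      fflush()               # push each line out immediately
    }'
}

# Intended use (runs until interrupted; assumes the log and exclude
# variables are set as in the script above):
#   tail -f "$log" | grep --line-buffered -E -v "$exclude" | format_referrers 35
```

Every stage of the pipeline has to flush after each line, otherwise output only shows up once a buffer fills. That is why the sketch uses both --line-buffered and fflush().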

Your current hosting provider does not support SSH access? You might then want to read my other post about hosting with Dreamhost for as little as $9/year. Have fun!

Rolling with Ruby on Rails on Dreamhost

It is not official yet, but Dreamhost has indeed added support for Ruby on Rails (v0.12.1) together with FastCGI!

I have just tested it briefly and it works as it is supposed to.

If you want to turn FastCGI on, log in to your Dreamhost panel and go to Domains -> Web. Tick a single checkbox and FastCGI is enabled. Couldn’t be simpler.

Dreamhost, regarded by many as the hosting company, is indeed staying on the cutting edge, being the first major player to support Rails.

UPDATE: It’s now official.

(via)

Apple and Intel merger?

There seems to be some hidden truth beneath Apple’s decision to switch to Intel chips. Robert X. Cringely suggests that Intel might be planning to buy out Apple (just another word for a merger). I must say that he uses some really convincing arguments. On the other hand, the rumor that Intel might be producing PowerPC chips for Apple was very sensible and convincing too.

Here are some excerpts:

If Apple is willing to embrace the Intel architecture because of its performance and low power consumption, then why not go with AMD, which equals Intel’s power specs, EXCEEDS Intel’s performance specs AND does so at a lower price point across the board? Apple and AMD makes far more sense than Apple and Intel any day.

(…)

The vaunted Intel roadmap is nice, but no nicer than the AMD roadmap, and nothing that IBM couldn’t have matched. If Apple was willing to consider a processor switch, moving to the Cell Processor would have made much more sense than going to Intel or AMD, so I simply have to conclude that technology has nothing at all to do with this decision. This is simply about business – BIG business.

Read the full article: Going for Broke.

Vendor lock out

We are all familiar with the term vendor lock in (and it’s no news that it’s mainly associated with Microsoft), but a vendor lock out? Apparently this is what FeedBurner is doing with their latest offer:

Let’s say you decide you want to stop using FeedBurner. You loved the services, you loved the customer support, you loved everything about FeedBurner, but let’s face it: you’re going crazy with all the delightful services, and you’ve decided you can’t take it anymore. You want out. You’ve always been able to do this if you run your own server: just like you redirect your feed traffic to FeedBurner, you can redirect your traffic away from FeedBurner. No problem.

The only disconcerting fact is that they provide the return redirect for 31 days. After this period they delete your feed/redirect permanently. Other than that – they play very nice. Who knows, maybe I’ll switch my RSS feeds to them as well?

My real name is Paweł Gościcki

“Who is Paul then?” – you might ask. And I will gladly answer you. He is what I like to call my English alter ego. And no, I don’t have a split personality disorder, no matter what you might think. I use Paul instead of Pawel because it is the English equivalent of my Polish name (yes, I am from Poland) and this is, in fact, a blog written entirely in English. Using Paul here just seems more natural to me.

Apple confirmed the switch to Intel

Apparently the rumors were true and we will see Pentium-based Macs. A 3.4GHz P4 running Mac OS X was already shown at WWDC just yesterday. At least people were saying that it was actually a P4, as I did not see it with my own eyes. It is definitely a huge change for Apple. What I would really like to know is whether I will ever be able to install Mac OS X on my x86 machine as a secondary system, next to my beloved & hated Windows XP. Although many people say that it won’t happen, I certainly hope that it will, either officially or through some ugly and nasty hacks. It doesn’t matter which. I just want to have my own Mac OS X!

The demise of Google?

Is it time for Yahoo now? Who knows. All I know is that I receive about 2.5x more juice from Yahoo than from Google. This blog is a nice example of how search engines deal with new content. On the one hand, my blog is relatively new and barely linked to, besides one link from Max Thrane and a few others left in comments on other blogs. On the other hand, it has good and relevant content (the crazy frog video) that people search for, find and download (it generates about 10GB of traffic a day). Do more people use Yahoo? Hardly. It all comes down to the fact that Yahoo lists my blog higher than Google for various crazy frog queries.

Should they list my blog high for those queries? Sure. In fact I should be at least in the top three results for this query as my blog is actually one of the three websites which host it (at least the high bandwidth version). This means that my content is 100% relevant. I’m ranked at number 6 on Yahoo for this query (which isn’t that bad after all) and outside of the top 200 on Google. This says a lot – mostly along the lines of “you are not good enough to be a Google citizen”.

Coincidentally, Marek Futrega writes about Google’s outdated index for the special link: queries. Let’s see who actually links to my blog: Google, Yahoo and MSN. The results speak for themselves.

Big changes are always preceded by small symptoms, just like those described above. Is this an indication of Google’s demise? Maybe so, maybe not, but it clearly gives their competitors a chance to shine in those areas (and take away a small piece of the pie from them).

Crazy Frog – Axel F

Crazy Frog himself

Crazy Frog (aka the most annoying thing in the world) ringtone tops the British charts.

I have absolutely no idea why people find it so annoying! I don’t find it annoying at all. Quite the opposite – it’s very funny!

I spent a little bit of time searching for the full video, as it was taken down from its creator’s website (Kaktus Film), so I decided to upload it here and share it with you.

As a bonus, you can also watch the video that started it all. I remember watching it quite a while ago. It was damn funny at the time (and still is)!

Here are the links to all of those files:

There’s also a trailer for the sitcom A Bear’s Tail, featuring the Crazy Frog. It’s under the very meaningful title “Fuck crazy frog”. Here’s the link:

You might also want to check out the Crazy Frog – Popcorn Video post.

Can’t not create

Despite being written in 2000, this passage is still valid. Probably even more so now than it was back then. And I could not agree more with Mark Pilgrim’s words:

As I write this, the year is 2000, and the Internet is a battleground of intellectual property disputes. Some people would like you to believe that, without proper financial incentives, music, literature, and computer software would disappear. After all, who would make music if they can’t make money on it? Who would write? Who would program? I know the answer. The answer is that musicians will make music, not because they can make money, but because musicians are the people who can’t not make music. Writers will write because they can’t not write. I’ve been programming for 16 years, writing free software for 8. I can’t imagine not doing this. If you can imagine yourself not doing what you’re doing, do something else. Do whatever it is that you can’t not do.

You don’t need money to create art, no matter what kind of art it is. Will we see the end of big record companies and even bigger movie studios? Or, putting it another way, will we see more and more independent work (free of any copyright burden) gaining momentum and becoming a commercial success for its creators? I certainly hope so.

(via plasmasturm)