Find out how many unique hits a site gets based on the Apache access log

This is a nifty little Perl snippet which lists every unique IP that hit your site along with the number of hits per IP, sorted from most hits to fewest

perl -e '$ip{(split)[0]}++ while <>; print map "$_ : $ip{$_}\n", sort {$ip{$b} <=> $ip{$a}} keys %ip' access.log
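If you would rather not use Perl, here is a rough Python sketch of the same idea. It assumes, like the one-liner, that the client IP is the first whitespace-separated field on each line; the function names are my own. Unlike the Perl version it skips blank lines and also prints the total number of unique IPs.

```python
from collections import Counter


def count_hits_by_ip(lines):
    """Count requests per client IP (the first whitespace-separated field)."""
    hits = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            hits[fields[0]] += 1
    return hits


def report(hits):
    """Return the report lines: a unique-IP total, then IPs by hit count, highest first."""
    lines = ["unique IPs: %d" % len(hits)]
    for ip, count in hits.most_common():
        lines.append("%s : %d" % (ip, count))
    return lines
```

To run it against a real log, something like `with open("access.log") as log: print("\n".join(report(count_hits_by_ip(log))))` should do.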

SFTP Chroot Jail on Ubuntu

This shows you how to let a user transfer files via sftp while blocking their access via ssh into the system. This is particularly useful if you are hosting multiple sites and want to give specific clients/users access to files only inside their site directory.

Create an sftp group

sudo groupadd sftp

Create a user

Assign a custom home directory for the new user we are going to add. In this case, their site directory: /srv/www/johndoesite.com/

sudo useradd -d /srv/www/johndoesite.com/ jdoe

Set their password

sudo passwd jdoe

Change the user’s primary group to the one we just created

sudo usermod -g sftp jdoe

Set their shell to /bin/false

sudo usermod -s /bin/false jdoe

Set Permissions

This will recursively make jdoe the owner of all files/folders in johndoesite.com/

chown jdoe:sftp -R johndoesite.com

But this next command makes sure root is still the owner of the parent directory (johndoesite.com). Also make sure all the folders above johndoesite.com are owned by root (in this case /srv/www/). This is necessary in order for jailing to work correctly.

chown root:root johndoesite.com
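If the jail refuses to work later on, ownership is the usual suspect: sshd requires every component of the ChrootDirectory path to be a root-owned directory that is not writable by group or others. Here is a small Python sketch that checks this for a given path. The function names are my own, and it only covers the ownership and write-bit rule, nothing else sshd might complain about.

```python
import os
import stat


def component_is_safe(uid, mode):
    """True if a directory is root-owned and not writable by group or others."""
    group_or_other_writable = bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
    return uid == 0 and stat.S_ISDIR(mode) and not group_or_other_writable


def path_components(path):
    """Yield the path itself, then each parent directory up to the root."""
    path = os.path.abspath(path)
    while True:
        yield path
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent


def unsafe_components(chroot_dir):
    """Return the components of chroot_dir that break the ownership rule."""
    bad = []
    for component in path_components(chroot_dir):
        info = os.stat(component)
        if not component_is_safe(info.st_uid, info.st_mode):
            bad.append(component)
    return bad
```

Calling `unsafe_components("/srv/www/johndoesite.com")` on the server should return an empty list once the permissions are set up as described.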

Configuring OpenSSH

sudo nano /etc/ssh/sshd_config

Scroll to the bottom and add the following, commenting out any existing variations of these directives:

Subsystem sftp internal-sftp
Match group sftp
ChrootDirectory %h
X11Forwarding no
AllowTcpForwarding no
ForceCommand internal-sftp

Restart ssh

sudo service ssh restart

Test

If everything worked as expected, this should work

sftp jdoe@your-server

But this should not

ssh jdoe@your-server

Installing curb gem on Windows 7

You are probably reading this because you encountered an error like this on Windows:

Installing curb (0.7.18) with native extensions
Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.

Here’s how to fix it (Note: This is assuming you have successfully installed RailsInstaller or similar.)

1) Download libcurl (under the “Win32 - Generic” section) and extract the contents to C:\. At the time of writing 7.27.0 was the latest. If you download a different version, don’t forget to update the paths below.

2) Add C:\curl-7.27.0-devel-mingw32\bin to your Windows path

3) Run:

gem install curb --version 0.7.18 --platform=ruby -- -- --with-curl-lib="C:/curl-7.27.0-devel-mingw32/bin" --with-curl-include="C:/curl-7.27.0-devel-mingw32/include"

By the way, those multiple dashes are not a mistake! That’s the only way I could get it to work. You can change the version to meet your needs. I had another gem which was specifically dependent on version 0.7.18, so I chose that in particular.

CORS

I recently learned about CORS while creating an API for a web app. It stands for Cross-Origin Resource Sharing. This is a browser spec which defines a way for a web page on one domain to make requests to a web server (in this case an API) on a different domain, as long as that server opts in by sending the right response headers. In the past we’ve used various other techniques like JSONP and iframes to get around this issue. Not anymore.

Here’s a code snippet I used to enable CORS in my django API.

Create a middleware module in your Django project at rest/middleware.py and tweak the ALLOWED_* constants to your needs:

from django import http

ALLOWED_ORIGINS = 'http://mydomain.com'
ALLOW_CREDENTIALS = 'true'
ALLOWED_METHODS = 'POST, GET, OPTIONS, PUT, DELETE'
ALLOWED_HEADERS = 'Origin, X-Requested-With, Content-Type, Accept'

class CORSMiddleware(object):
    """
        This middleware allows cross-domain XHR by adding CORS headers to responses.

        eg.
        Access-Control-Allow-Origin: http://foo.example
    """
    def process_request(self, request):

        if 'HTTP_ACCESS_CONTROL_REQUEST_METHOD' in request.META:
            response = http.HttpResponse()
            response['Access-Control-Allow-Origin'] = ALLOWED_ORIGINS
            response['Access-Control-Allow-Credentials'] = ALLOW_CREDENTIALS
            response['Access-Control-Allow-Methods'] = ALLOWED_METHODS
            response['Access-Control-Allow-Headers'] = ALLOWED_HEADERS
            return response

        return None

    def process_response(self, request, response):
        # Avoid unnecessary work
        if response.has_header('Access-Control-Allow-Origin'):
            return response

        response['Access-Control-Allow-Origin'] = ALLOWED_ORIGINS
        response['Access-Control-Allow-Credentials'] = ALLOW_CREDENTIALS
        response['Access-Control-Allow-Methods'] = ALLOWED_METHODS
        response['Access-Control-Allow-Headers'] = ALLOWED_HEADERS
        return response

Then just add that to your settings.py:

.
.
MIDDLEWARE_CLASSES = (
    .
    .
    'rest.middleware.CORSMiddleware'
    )
.
.

That’s it! Also, http://enable-cors.org/ is a great resource if you are looking for code samples and guidelines on how to go about implementing CORS for other languages/platforms.
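One thing to keep in mind: Access-Control-Allow-Origin takes a single origin (or *), and * cannot be combined with Access-Control-Allow-Credentials: true. If you ever need to allow more than one origin, a common approach is to echo back the request’s Origin header when it is on an allow-list. Here is a minimal, framework-free sketch of that idea. The second origin and the cors_headers helper are made up for illustration and are not part of the middleware above; in Django the request origin would come from request.META.get('HTTP_ORIGIN').

```python
# Hypothetical allow-list; replace with the origins that may call your API.
ALLOWED_ORIGINS = {"http://mydomain.com", "http://app.mydomain.com"}
ALLOWED_METHODS = "POST, GET, OPTIONS, PUT, DELETE"
ALLOWED_HEADERS = "Origin, X-Requested-With, Content-Type, Accept"


def cors_headers(request_origin):
    """Return the CORS response headers for an Origin, or {} if it is not allowed."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    return {
        # Echo the caller's origin instead of sending a fixed value.
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": ALLOWED_METHODS,
        "Access-Control-Allow-Headers": ALLOWED_HEADERS,
        # The response now depends on the Origin header, so tell caches.
        "Vary": "Origin",
    }
```

Inside the middleware you would then copy each returned key/value pair onto the response instead of setting the four headers by hand.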

But wait, not so fast. As always, IE stands in the way of creating elegant web applications. IE6 and IE7 both lack CORS support, while IE8 and IE9 have broken implementations. IE10 is the only version with a non-buggy CORS implementation. If you want to achieve cross-browser compatibility, you would have to fall back to JSONP, which is very limited and only supports GET requests, or use some iframe magic like I talked about in my previous blog post.

Luckily I was able to find a better option. A bit more Googling led me to this JavaScript library which lets you seamlessly make cross-browser, cross-domain AJAX requests without major hacks. So how does it do it? You can read the in-depth explanation on the author’s website, but basically it relies on CORS in modern browsers which support it, while falling back to flXHR, a cross-domain AJAX shim (written in Flash+JS), on older browsers to achieve essentially the same goal.

So here’s to writing more elegant APIs and web applications!

Tips and Tricks for Django on Google App Engine

Recently I was tasked with creating a Django app which runs on Google App Engine (GAE). Now anyone familiar with GAE knows that it doesn’t fully support relational/SQL storage just yet (Google Cloud SQL is in an experimental phase at this time). Instead it has something called App Engine Datastore, which is a schemaless object store (NoSQL of sorts). So to get Django to play nice with this storage, the good people at ABP created a fork, django-nonrel, which runs seamlessly out of the box on GAE. Easy enough.

However, there were a couple of things I tried to do that were not so straightforward. I’m just going to document those here in the hope that someone will find them useful.

Uploading Images

First up, how about uploading images?

Well in that case you’ll have to install the following libraries:

django-filetransfers - It’s an abstraction layer which allows for uploads into remote datastores, like AppEngine or S3 using django’s standard models.FileField

But what if you wanted to use models.ImageField instead of models.FileField? Since ImageField depends on PIL which is not available on GAE, it won’t work out of the box. You will need to install a mock PIL class library which handles the 3 functions necessary for the ImageField to validate: open(), read(), and verify().

Next up, what if you wanted to upload images through the Django admin? In all likelihood you encountered an error like this while trying to do so:

ValueError: The App Engine storage backend only supports BlobstoreFile instances or File instances whose file attribute is a BlobstoreFile.

Luckily I found a fork of the filetransfers library here which adds support for the admin. So instead of using the original package, just use the fork. Then all you have to do is inherit your admin class from FiletransferAdmin in your admin.py:

from django.contrib import admin

from filetransfers.admin import FiletransferAdmin
from myapp.models import MyModel

class MyModelAdmin(FiletransferAdmin):
    pass

admin.site.register(MyModel, MyModelAdmin)

Voilà! That’s all you need to do to get image uploads working correctly using django-nonrel admin on GAE!

Search

Due to the NoSQL nature of the GAE storage, some of the things you would expect, like case-insensitive queries (iexact, istartswith, etc.) and JOINs, won’t work as expected. ABP strikes once again with dbindexer, which abstracts away most of those differences and lets you use some of that functionality just like you would in regular Django. Also, if you are looking for full-text search support, there’s a great package for that as well: nonrel-search
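To give a feel for how a case-insensitive lookup can work on a store that only does exact matches, here is a toy Python sketch of the general trick: save a lowercased copy of the field next to the original and query that copy instead. This is not dbindexer’s actual API, and the Datastore class, the helper functions and the idx_ field naming are all made up for illustration.

```python
class Datastore:
    """Toy store that only supports exact-match filters (a simplification)."""

    def __init__(self):
        self._rows = []

    def put(self, row):
        self._rows.append(dict(row))

    def filter_exact(self, field, value):
        return [row for row in self._rows if row.get(field) == value]


def save_with_iexact_index(store, row, field):
    """Save the row plus a hidden lowercased copy of one field."""
    indexed = dict(row)
    indexed["idx_%s_iexact" % field] = row[field].lower()
    store.put(indexed)


def query_iexact(store, field, value):
    """Case-insensitive match, done as an exact match on the lowercased copy."""
    return store.filter_exact("idx_%s_iexact" % field, value.lower())
```

The cost is one extra stored field per indexed lookup type, which has to be kept in sync on every save.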

Also keep in mind that ABP no longer supports or maintains any of these libraries, so it’s best to get the latest source from the official GitHub account - https://github.com/django-nonrel

Hopefully in the future all this won’t be necessary as GAE support for Django becomes more robust. Although you can still give it a spin while it’s still experimental: https://developers.google.com/appengine/docs/python/cloud-sql/django

Happy hacking!