Ivan-Site.com

Interact with S3 Without Temp Files

There is a lot of documentation and many examples covering uploading files to and downloading files from S3. Often you don't actually want to keep the files you upload or download around, and want to delete them as soon as the transfer is done. You can easily do this using temp files like this:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;

import java.io.*;

class S3TempFileTest {

    private static final String S3_FILE_PREFIX = "s3test";
    private static final String S3_FILE_SUFFIX = ".tmp";
    private static final String S3_BUCKET_NAME = "bucket";
    private static final String S3_KEY_NAME = "key";

    private static final AmazonS3 AMAZON_S3 = new AmazonS3Client();

    public void testUploadWithTempFile() throws IOException {
        File tempFile = File.createTempFile(S3_FILE_PREFIX, S3_FILE_SUFFIX);
        // writeContent(tempFile)
        try {
            AMAZON_S3.putObject(S3_BUCKET_NAME, S3_KEY_NAME, tempFile);
        } finally {
            tempFile.delete();
        }
    }

    public void testDownloadWithTempFile() throws IOException {
        File tempFile = File.createTempFile(S3_FILE_PREFIX, S3_FILE_SUFFIX);
        try {
            AMAZON_S3.getObject(new GetObjectRequest(S3_BUCKET_NAME, S3_KEY_NAME), tempFile);
            // process(tempFile)
        } finally {
            tempFile.delete();
        }
    }
}

However, code like this is not easily testable as you're interacting with the file system, and is also not the most efficient if you're dealing with small files because writing to hard disk is slow (compared to memory IO).

Here is an example of how you could do it all in memory without using temp files. The example uses Jackson-serialized JSON content, but it applies to any kind of file.

Read more

Posted Fri 06 November 2015 by Ivan Dyedov in Java (Java, AWS)

Auto-scaling on Amazon EC2 with Ansible

There are several documented ways of setting up autoscaling groups with the help of ansible, including the official ansible docs. EC2 auto-scaling does, however, assume that servers launched from the AMI will be ready to serve traffic. That means the AMI has to be pre-baked (with ansible or other tools), which tightly couples code to the AMI. So, you would have to rebuild the AMI for every code release, which is less than ideal in certain environments, particularly when deploys happen often and you have a large number of servers.

I posted how you could re-use the same AMI to bootstrap Chef in an EC2 auto-scaling group in a previous post, and it turns out you can apply the same concepts when bootstrapping nodes with ansible.

Read more

Posted Sun 26 October 2014 by Ivan Dyedov in AWS (Amazon Web Services, Ubuntu, ansible, autoscale, Python)

Force-evict expired keys from redis

According to the redis documentation, keys with an expiration are evicted either in a passive way, when they are accessed, or in an active way, where every 0.1 seconds redis looks at 100 random keys and evicts the ones that are expired. If at least 25/100 keys were evicted, it tries again.

Unfortunately, it seems that for a write-heavy application this probabilistic algorithm is not sufficient to keep memory usage steady; for us it just continued to grow over time. So, we were forced to manage expirations ourselves. One way would be to record written keys in a separate data structure per hour, and then periodically go through those keys to force eviction. Storing all keys twice seems like a big overhead, though.

Another way is to iterate through the keys in redis. Redis recently added a SCAN command that allows you to do just that, and in the process evict keys that have expired. Below is a simple python script that illustrates this approach.
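The author's script is behind the link. As a rough sketch of the idea only, the loop below walks the whole keyspace with SCAN and relies on redis dropping expired keys as the scan touches them. It assumes a redis-py style client whose `scan(cursor=..., count=...)` returns a `(next_cursor, keys)` pair; the function name `scan_all_keys` and the `batch_size` parameter are made up for illustration.

```python
def scan_all_keys(client, batch_size=1000):
    """Walk the entire keyspace with SCAN and return how many keys were seen.

    `client` is assumed to expose a redis-py style
    scan(cursor=..., count=...) -> (next_cursor, keys) method.
    The keys themselves are not modified; the point of the walk is that
    redis gets a chance to notice and remove keys that have already expired.
    """
    cursor = 0
    seen = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, count=batch_size)
        seen += len(keys)
        # SCAN signals the end of a full iteration by returning cursor 0.
        if int(cursor) == 0:
            break
    return seen


# Hypothetical usage, assuming the redis-py package and a local server:
#   import redis
#   print(scan_all_keys(redis.StrictRedis(host="localhost", port=6379)))
```

Running something like this from cron trades a full keyspace walk for not having to store every key twice; `batch_size` is only a hint to redis about how much work to do per call.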

Read more

Posted Sat 28 June 2014 by Ivan Dyedov in Python (Linux, Redis, Python)

Sync Videos from a Foscam Webcam

I was recently playing around with a Foscam FI9821W V2 webcam, and found its interface quite lacking in features. For example, it has no option of syncing alarm videos to any remote server. So it would just record until your SD card fills up at which point you would have to manually free up space.

Luckily, it also runs an (undocumented) FTP server on port 50021 where you can access and manage all of the SD card's contents. So, you can just run the following command as often as you like to move all new videos to your NAS.

Read more

Posted Sat 07 December 2013 by Ivan Dyedov in Linux (Linux)

Monitoring MongoDB in Munin on Ubuntu 13.04

Here's how to install munin-node and start monitoring a MongoDB server on a fresh Ubuntu box:

sudo apt-get install git munin-node
git clone git://github.com/erh/mongo-munin.git /home/ubuntu/mongo-munin
sudo ln -sf /home/ubuntu/mongo-munin/mongo_btree /etc/munin/plugins/mongo_btree
sudo ln -sf /home/ubuntu/mongo-munin/mongo_conn /etc/munin/plugins/mongo_conn
sudo ln -sf /home/ubuntu/mongo-munin/mongo_lock /etc/munin/plugins/mongo_lock
sudo ln -sf /home/ubuntu/mongo-munin/mongo_mem /etc/munin/plugins/mongo_mem
sudo ln -sf /home/ubuntu/mongo-munin/mongo_ops /etc/munin/plugins/mongo_ops
sudo service munin-node restart

You can then test it by running:

sudo munin-run mongo_conn

Posted Wed 05 June 2013 by Ivan Dyedov in Linux (Linux, Ubuntu)

Autoreload Code in Python

While you're developing and debugging your WSGI application, there are lots of ways to automatically reload your code on change out of the box. For example, if you're using werkzeug you can just pass the use_reloader flag:

run_simple('127.0.0.1', 5000, app, use_reloader=True)

For Flask, which actually uses werkzeug internally, setting debug=True is all you need:

app.run(debug=True)

Django will automatically do it for you when you use:

manage.py runserver

All of these examples work great while developing locally; however, using them in production is strongly discouraged. So the question arises: what do you do to automatically reload your code in production?

Read more

Posted Sun 07 April 2013 by Ivan Dyedov in Python (Python, uWSGI, gunicorn, WSGI)

Auto-scaling on Amazon EC2 with Opscode Chef

There are lots of ways to set up auto-scaling for EC2 nowadays. There are Amazon's own products, like the recently announced AWS OpsWorks and CloudFormation. The benefit of using these tools is integration with other AWS services. But there are also downsides: OpsWorks cannot currently integrate with ELB, and using CloudFormation will probably involve writing funky JSON templates.

There are also third-party solutions, like the open-source Asgard from Netflix and RightScale, an enterprise cloud management service.

These services can also be used for some basic configuration management, though I feel that is not their primary purpose. We chose to go with a separate solution for that: Opscode Chef.

There are lots of guides on how to set up EC2 auto-scaling, as well as guides on integrating Chef with CloudFormation, like Amazon's own docs. However, there isn't much information on how to do this without CloudFormation. Specifically, if you just want auto-scaling without the extra complexity of CloudFormation and still want to use Chef for configuration management, here's what you need to do.

Read more

Posted Thu 28 February 2013 by Ivan Dyedov in AWS (Amazon Web Services, Ubuntu, Opscode Chef, autoscale)

gevent: gunicorn vs uWSGI

Following my previous benchmark I finally got around to benchmarking uWSGI with gevent and comparing its performance to gunicorn with gevent worker type. To do this you have to compile uWSGI and gevent from source, so I used the latest tagged releases at the time of the test, uWSGI 1.3 and gevent 1.0b4.

As it turns out, performance of the two servers is almost identical when using gevent...

Read more

Posted Sun 14 October 2012 by Ivan Dyedov in Python (Python, WSGI, gevent, gunicorn, uWSGI, benchmark, nginx, Ubuntu, Linux)

Benchmark uWSGI vs gunicorn for async workers

All of the WSGI benchmarks I found were pretty outdated or didn't include async results, so I decided to do some benchmarking myself.

I know there are other options for running Python WSGI applications, but I settled on just two: gunicorn, which has the advantage of being pure-Python, and uWSGI, which has the advantage of being pure-C.

Read more

Posted Wed 12 September 2012 by Ivan Dyedov in Python (Python, Ubuntu, WSGI, uWSGI, gunicorn, benchmark, nginx, gevent, eventlet)

Download Oracle Java JRE & JDK using a script

Oracle has recently disallowed direct downloads of java from their servers (without going through the browser and agreeing to their terms, which you can look at here: Oracle terms). So, if you try:

wget "http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-x64.tar.gz"

you will receive a page with the error message "In order to download products from Oracle Technology Network you must agree to the OTN license terms".

This can be rather troublesome for setting up servers with automated scripts.

Luckily, it seems that a single cookie is all that is needed to bypass this (you still have to agree to the terms to install):

Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie

So, if you want to download jdk7u4 for 64-bit Linux (e.g., Ubuntu) using wget, you can use:

wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-x64.tar.gz"

Read more

Posted Sat 05 May 2012 by Ivan Dyedov in Java (Java, Ubuntu, Linux)

Counting leaves with Python

Say you have a multi-level dictionary in Python that looks like this:

tree = {
    "first1": {
        "second1": {
            "third1": {
                "leaf1": 123,
                "leaf2": 234,
                "leaf3": 345
            }
        },
        "second2": {
            "leaf4": 456,
            "leaf5": 567
        }
    },
    "leaf6": 678
}

Here's a way to recursively count the number of elements, or "leaves" in this so-called dictionary tree (a single leaf being anything that's not a dictionary):
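The author's version is in the full post. As a minimal sketch of one way to do it, assuming that every non-dictionary value counts as exactly one leaf (the function name `count_leaves` is made up for illustration):

```python
def count_leaves(node):
    """Count the values in a nested dict that are not themselves dicts."""
    # Anything that is not a dictionary is a single leaf.
    if not isinstance(node, dict):
        return 1
    # A dictionary contributes the leaves of all of its values;
    # an empty dictionary therefore contributes zero.
    return sum(count_leaves(child) for child in node.values())


tree = {
    "first1": {
        "second1": {
            "third1": {"leaf1": 123, "leaf2": 234, "leaf3": 345}
        },
        "second2": {"leaf4": 456, "leaf5": 567},
    },
    "leaf6": 678,
}

print(count_leaves(tree))  # prints 6
```

Because the recursion depth follows the nesting depth, very deeply nested dictionaries could hit Python's recursion limit; for trees like the one above that is not a concern.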

Read more

Posted Wed 03 August 2011 by Ivan Dyedov in Python (Python)

Installing Sun Java JRE on Ubuntu 11.04

Here's the easiest way to install Sun (Oracle) Java JRE on newer versions of Ubuntu (tested on 11.04) without resorting to third-party PPAs:

sudo apt-add-repository "deb http://archive.canonical.com/ natty partner"
sudo apt-get update
sudo apt-get install sun-java6-jre

Read more

Posted Sat 16 July 2011 by Ivan Dyedov in Linux (Java, Ubuntu, Linux)

Signing an XPI using a VeriSign Code Signing certificate

I recently had to sign a Mozilla Firefox extension using a VeriSign Code Signing certificate. The process to receive the cert is pretty straightforward - you apply for the certificate on VeriSign's page where you input your company details and payment information. By the way, you can use "THEDEAL99" promo code to get $400 off $499 for a Microsoft© Authenticode© certificate to make the price somewhat reasonable. After your application is submitted they verify the validity of your company and the information you put in and issue you the certificate that you can use for code signing.

Read more

Posted Tue 16 November 2010 by Ivan Dyedov in Linux (Cryptography, XPI, code signing, verisign, Linux)
