Amazon Web Services offers Simple Queue Service (SQS), which makes it easy to decouple and scale your asynchronous system components. Messages in SQS are retried on failure, once per message visibility timeout (30 seconds by default). However,
a common practice is to use exponential backoff instead of constant wait times for better flow control (see Error Retries and Exponential Backoff in AWS and the wikipedia page on Exponential backoff).
This post gives you a sample implementation of exponential backoff in your SQS consumer.
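As a rough illustration of the idea (a sketch in Python, not the implementation from the full post), the wait can be doubled on every failed attempt by raising the message's visibility timeout. The function names `backoff_seconds` and `retry_later` are hypothetical. `sqs` is assumed to be a boto3 SQS client, and the message is assumed to have been received with the `ApproximateReceiveCount` attribute requested.

```python
# SQS does not accept a visibility timeout longer than 12 hours.
SQS_MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60


def backoff_seconds(receive_count, base=30, cap=SQS_MAX_VISIBILITY_TIMEOUT):
    """Return base * 2^(receive_count - 1) seconds, limited to cap."""
    attempts = max(int(receive_count), 1)
    return min(cap, base * 2 ** (attempts - 1))


def retry_later(sqs, queue_url, message):
    """Delay the next delivery of a failed message by the backoff interval.

    Assumes `message` is one entry of a boto3 receive_message() response
    that includes the ApproximateReceiveCount attribute.
    """
    receive_count = int(message["Attributes"]["ApproximateReceiveCount"])
    delay = backoff_seconds(receive_count)
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
        VisibilityTimeout=delay,
    )
    return delay
```

With the default base of 30 seconds, the first four failures wait 30, 60, 120 and 240 seconds. The consumer would call `retry_later` from its error handler instead of letting the default timeout expire.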
Read more
Posted Sun 17 June 2018
by Ivan Dyedov
in Java
(Java, AWS)
There is a lot of documentation and many examples around uploading and downloading files to/from S3.
A lot of the time you don't actually want to keep the files you upload or download, and want to delete them as soon as that process is done.
You can easily do this using temp files like this:
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;
import java.io.*;

class S3TempFileTest {
    private static final String S3_FILE_PREFIX = "s3test";
    private static final String S3_FILE_SUFFIX = ".tmp";
    private static final String S3_BUCKET_NAME = "bucket";
    private static final String S3_KEY_NAME = "key";
    private static final AmazonS3 AMAZON_S3 = new AmazonS3Client();

    public void testUploadWithTempFile() throws IOException {
        File tempFile = File.createTempFile(S3_FILE_PREFIX, S3_FILE_SUFFIX);
        // writeContent(tempFile)
        try {
            AMAZON_S3.putObject(S3_BUCKET_NAME, S3_KEY_NAME, tempFile);
        } finally {
            tempFile.delete();
        }
    }

    public void testDownloadWithTempFile() throws IOException {
        File tempFile = File.createTempFile(S3_FILE_PREFIX, S3_FILE_SUFFIX);
        try {
            AMAZON_S3.getObject(new GetObjectRequest(S3_BUCKET_NAME, S3_KEY_NAME), tempFile);
            // process(tempFile)
        } finally {
            tempFile.delete();
        }
    }
}
However, code like this is not easily testable because it interacts with the file system,
and it is also not the most efficient for small files, because writing to disk is slow compared to memory IO.
Here is an example of how you could do it all in memory without using temp files. The example uses Jackson-serialized JSON content,
but it applies to any kind of file.
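The same in-memory idea can be sketched in Python. This is a swapped-in illustration using the standard `json` module rather than Jackson, not the Java code from the full post. The helper names `upload_json` and `download_json` are hypothetical, and `s3` is assumed to be a boto3 S3 client.

```python
import json


def upload_json(s3, bucket, key, payload):
    """Serialize payload to JSON bytes in memory and upload them directly."""
    body = json.dumps(payload).encode("utf-8")
    s3.put_object(Bucket=bucket, Key=key, Body=body)


def download_json(s3, bucket, key):
    """Read the object body straight into memory and parse it as JSON."""
    response = s3.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read().decode("utf-8"))
```

Because the client is passed in, a test can hand these functions a small fake object with `put_object` and `get_object` methods and never touch the file system or the network.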
Read more
Posted Fri 06 November 2015
by Ivan Dyedov
in Java
(Java, AWS)
There are several documented ways of setting up autoscaling groups with the help of ansible, including the official ansible docs.
EC2 auto-scaling does, however, assume that servers launched with the AMI will be ready to serve traffic,
meaning that the AMI has to be pre-baked (with ansible or other tools), which tightly couples code with the AMI.
So, you would have to re-build the AMI for every code release, which is less than ideal in certain environments -
particularly when deploys happen more often and you have a large number of servers.
I posted how you could re-use the same AMI to bootstrap Chef in an EC2 auto-scaling group in a previous post,
and it turns out you can apply the same concepts when bootstrapping nodes with ansible.
Read more
Posted Sun 26 October 2014
by Ivan Dyedov
in AWS
(Amazon Web Services, Ubuntu, ansible, autoscale, Python)
According to the redis documentation, keys with an expiration are evicted either in a passive way, when they are accessed, or
in an active way, where every 0.1 seconds redis looks at 100 random keys and evicts the ones that are expired. If at least 25 of those 100
keys were evicted, it tries again.
Unfortunately, it seems that for a write-heavy application this probabilistic algorithm is still not sufficient to keep memory
usage steady; for us it just continued to grow over time. So, we were forced to manage expirations ourselves. One way would be to record
the keys written each hour in a separate data structure, and then periodically go through those keys to force eviction. Storing
every key twice seems like a big overhead.
Another way is to iterate through the keys in redis. Redis recently added a SCAN command that allows you to do
just that, and in the process evict keys that have expired. Below is a simple python script that illustrates this approach.
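A minimal sketch of such a sweep (not necessarily the script from the full post) could look like the following. The function name `sweep_keyspace` and the batch size are hypothetical. `client` is assumed to be a redis-py client, whose `scan()` returns a `(cursor, keys)` pair.

```python
def sweep_keyspace(client, batch_size=1000):
    """Walk the whole keyspace once with SCAN and return how many keys were seen.

    The point is the side effect described above: iterating the keys gives
    redis the chance to evict the expired ones along the way.
    """
    cursor = 0
    seen = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, count=batch_size)
        seen += len(keys)
        if int(cursor) == 0:  # a cursor of 0 means the iteration is complete
            break
    return seen
```

It could be run periodically, for example from cron, with something like `sweep_keyspace(redis.Redis(host="localhost"))`.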
Read more
Posted Sat 28 June 2014
by Ivan Dyedov
in Python
(Linux, Redis, Python)
I was recently playing around with a Foscam FI9821W V2 webcam, and found its interface quite lacking in features.
For example, it has no option of syncing alarm videos to any remote server. So it would just record until your SD
card fills up at which point you would have to manually free up space.
Luckily, it also runs an (undocumented) FTP server on port 50021 where you can access and manage all of the SD card's
contents. So, you can just run the following command as often as you like to move all new videos to your NAS.
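The command itself is in the full post. As a rough equivalent, here is a Python sketch using the standard `ftplib` module. The function names, the credentials, and the remote directory are all assumptions, since the camera's directory layout is not described here. The sketch also assumes the remote directory is flat and contains only video files.

```python
import os
from ftplib import FTP


def connect(host, user, password, port=50021):
    """Open an FTP session to the camera (host and credentials are placeholders)."""
    ftp = FTP()
    ftp.connect(host, port)
    ftp.login(user, password)
    return ftp


def move_recordings(ftp, remote_dir, local_dir):
    """Download every file in remote_dir, deleting each one after it is saved."""
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    moved = []
    for name in ftp.nlst():
        target = os.path.join(local_dir, os.path.basename(name))
        with open(target, "wb") as handle:
            ftp.retrbinary("RETR " + name, handle.write)
        ftp.delete(name)  # only reached if the download did not raise
        moved.append(name)
    return moved
```

Deleting each file only after its download has finished means an interrupted run leaves the remaining videos on the SD card for the next run.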
Read more
Posted Sat 07 December 2013
by Ivan Dyedov
in Linux
(Linux)
Here's how to install munin-node and start monitoring a MongoDB server on a fresh Ubuntu box:
sudo apt-get install git munin-node
git clone https://github.com/erh/mongo-munin.git /home/ubuntu/mongo-munin
sudo ln -sf /home/ubuntu/mongo-munin/mongo_btree /etc/munin/plugins/mongo_btree
sudo ln -sf /home/ubuntu/mongo-munin/mongo_conn /etc/munin/plugins/mongo_conn
sudo ln -sf /home/ubuntu/mongo-munin/mongo_lock /etc/munin/plugins/mongo_lock
sudo ln -sf /home/ubuntu/mongo-munin/mongo_mem /etc/munin/plugins/mongo_mem
sudo ln -sf /home/ubuntu/mongo-munin/mongo_ops /etc/munin/plugins/mongo_ops
sudo service munin-node restart
You can then test it by running:
sudo munin-run mongo_conn
Read more
Posted Wed 05 June 2013
by Ivan Dyedov
in Linux
(Linux, Ubuntu)
While you're developing and debugging your WSGI application, there are lots of ways to automatically reload your code on change out of the box.
For example, if you're using werkzeug you can just pass the use_reloader flag:
run_simple('127.0.0.1', 5000, app, use_reloader=True)
For Flask, which actually uses werkzeug internally, setting debug=True is all you need.
Django will automatically do it for you when you use its built-in development server.
All of these examples work great while developing locally; however, using them in production is strongly discouraged.
So the question arises: what do you do to automatically reload your code in production?
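One possible answer for gunicorn, offered as a sketch and not necessarily the approach the full post takes, is to send `SIGHUP` to the master process, which makes it start fresh workers and gracefully retire the old ones. The function name `reload_gunicorn` and the pidfile path are hypothetical. The sketch assumes gunicorn was started with its `--pid` option so that the master's PID is written to a file.

```python
import os
import signal


def reload_gunicorn(pidfile, kill=os.kill):
    """Send SIGHUP to the gunicorn master whose PID is stored in pidfile.

    `kill` is injectable so the function can be exercised without
    signalling a real process.
    """
    with open(pidfile) as handle:
        pid = int(handle.read().strip())
    kill(pid, signal.SIGHUP)
    return pid
```

A deploy script could call `reload_gunicorn("/var/run/gunicorn.pid")` after the new code is in place, where that path is only an example.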
Read more
Posted Sun 07 April 2013
by Ivan Dyedov
in Python
(Python, uWSGI, gunicorn, WSGI)
There are lots of ways to set up auto-scaling for EC2 nowadays.
There are Amazon's own products, like the recently announced AWS OpsWorks
and CloudFormation. The benefit of using these tools is
integration with other AWS services. But there are also downsides:
OpsWorks cannot integrate with ELB currently, and using CloudFormation
will probably involve writing funky JSON templates.
There are also third-party solutions, like the open-source Asgard from
Netflix and rightscale, an enterprise cloud management service.
These services can also be used for some basic configuration management,
though I feel that is not their primary purpose. We chose to go with a
separate solution for that - Opscode Chef.
There are lots of guides on how to set up EC2 auto-scaling, as well as
guides on integrating Chef with CloudFormation, like Amazon's own
docs. However, there isn't much information on how to do this without
CloudFormation. Specifically, if you just want auto-scaling without the
extra complexity of CloudFormation and still want to use Chef for
configuration management, here's what you need to do.
Read more
Posted Thu 28 February 2013
by Ivan Dyedov
in AWS
(Amazon Web Services, Ubuntu, Opscode Chef, autoscale)
Following my previous benchmark I finally got around to benchmarking
uWSGI with gevent and comparing its performance to gunicorn with gevent
worker type. To do this you have to compile uWSGI and gevent from
source, so I used the latest tagged releases at the time of the test,
uWSGI 1.3 and gevent 1.0b4.
As it turns out, performance of the two servers is almost identical when
using gevent...
Read more
Posted Sun 14 October 2012
by Ivan Dyedov
in Python
(Python, WSGI, gevent, gunicorn, uWSGI, benchmark, nginx, Ubuntu, Linux)
All of the WSGI benchmarks I found were pretty outdated or didn't
include async results, so I decided to do some benchmarking myself.
I know there are other options for running python WSGI applications, but I
settled on just 2: gunicorn, which has the advantage of being
pure-python, and uWSGI, which has the advantage of being pure-C.
Read more
Posted Wed 12 September 2012
by Ivan Dyedov
in Python
(Python, Ubuntu, WSGI, uWSGI, gunicorn, benchmark, nginx, gevent, eventlet)
Oracle has recently disallowed direct downloads of Java from their servers (without going through the browser and agreeing to their terms, which you can look at here:
Oracle terms). So, if you try:
wget "http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-x64.tar.gz"
you will receive a page with the error message "In order to download products from Oracle Technology Network you must agree to the OTN license terms".
This can be rather troublesome for setting up servers with automated scripts.
Luckily, it seems that a single cookie is all that is needed to bypass this (you still have to agree to the terms to install):
Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie
So, if you want to download jdk7u4 for 64-bit Linux (e.g., Ubuntu) using wget, you can use:
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-x64.tar.gz"
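If the download is driven from a Python provisioning script instead of wget, the same header can be attached with the standard library. This is a sketch under the assumption that the cookie behaves as described above. The function names `build_request` and `download` are hypothetical.

```python
import shutil
import urllib.request

JDK_URL = "http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-x64.tar.gz"
LICENSE_COOKIE = (
    "gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; "
    "oraclelicense=accept-securebackup-cookie"
)


def build_request(url=JDK_URL):
    """Build a request that carries the license cookie from the wget command."""
    return urllib.request.Request(url, headers={"Cookie": LICENSE_COOKIE})


def download(url=JDK_URL, target="jdk-7u4-linux-x64.tar.gz"):
    """Stream the archive to disk without loading it all into memory."""
    with urllib.request.urlopen(build_request(url)) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```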
Read more
Posted Sat 05 May 2012
by Ivan Dyedov
in Java
(Java, Ubuntu, Linux)
Say you have a multi-level dictionary in Python that looks like this:
tree = {
    "first1": {
        "second1": {
            "third1": {
                "leaf1": 123,
                "leaf2": 234,
                "leaf3": 345
            }
        },
        "second2": {
            "leaf4": 456,
            "leaf5": 567
        }
    },
    "leaf6": 678
}
Here's a way to recursively count the number of elements, or "leaves" in
this so-called dictionary tree (a single leaf being anything that's not
a dictionary):
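One straightforward way to write such a count is shown below, as a sketch that may differ from the version in the full post. The function name `count_leaves` is hypothetical, and the tree is repeated so the snippet runs on its own.

```python
def count_leaves(node):
    """Count the values in a nested dict that are not themselves dicts."""
    if not isinstance(node, dict):
        return 1  # anything that is not a dictionary is a single leaf
    return sum(count_leaves(child) for child in node.values())


tree = {
    "first1": {
        "second1": {"third1": {"leaf1": 123, "leaf2": 234, "leaf3": 345}},
        "second2": {"leaf4": 456, "leaf5": 567},
    },
    "leaf6": 678,
}

print(count_leaves(tree))  # prints 6
```

Note that an empty dictionary contributes zero leaves under this definition.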
Read more
Posted Wed 03 August 2011
by Ivan Dyedov
in Python
(Python)
Here's the easiest way to install Sun (Oracle) Java JRE on newer
versions of Ubuntu (tested on 11.04) without resorting to third party
PPAs.
sudo apt-add-repository "deb http://archive.canonical.com/ natty partner"
sudo apt-get update
sudo apt-get install sun-java6-jre
Read more
Posted Sat 16 July 2011
by Ivan Dyedov
in Linux
(Java, Ubuntu, Linux)
There's lots of information online on how to move MySQL data and logs to
a new location but I had to use a combination of several guides in order
to get it fully working. This is a reference for me if I ever need to do
it again or for anyone else that may find it useful.
Read more
Posted Sat 08 January 2011
by Ivan Dyedov
in Linux
(MySQL, Ubuntu, Linux)
I recently had to sign a Mozilla Firefox extension using a VeriSign Code
Signing certificate. The process to receive the cert is pretty
straightforward - you apply for the certificate on VeriSign's page
where you input your company details and payment information. By the
way, you can use "THEDEAL99" promo code to get $400 off $499 for a
Microsoft© Authenticode© certificate to make the price somewhat
reasonable. After your application is submitted they verify the validity
of your company and the information you put in and issue you the
certificate that you can use for code signing.
Read more
Posted Tue 16 November 2010
by Ivan Dyedov
in Linux
(Cryptography, XPI, code signing, verisign, Linux)