This text explains how to configure nginx and vaurien to build a local mirror of the data used by the flag download examples. The mirror avoids network traffic, and the vaurien proxy introduces controlled delays and errors for testing.
The flag download examples are designed to compare the performance of different approaches to finding and downloading files from the Web. However, we don't want to hit a public server with multiple requests per second while testing, and we want to be able to simulate high latency and random network errors.
For this setup I chose nginx as the HTTP server because it is very fast and easy to configure, and the vaurien proxy because it was designed by Mozilla to introduce delays and network errors for testing.
The archive flags.zip contains a directory flags/ with 194
subdirectories, each containing a .gif image and a metadata.json file.
These images are public-domain flags copied from the CIA World Fact Book [1].
[1] https://www.cia.gov/library/publications/the-world-factbook/
Once these files are unpacked to the flags/ directory and nginx is
configured, you can experiment with the flags*.py examples without hitting
the network.
Unpack the initial data in the countries/ directory and verify that 194
directories are created in countries/flags/, each with a .gif and
a metadata.json file:
$ unzip flags.zip
... many lines omitted...
creating: flags/zw/
inflating: flags/zw/metadata.json
inflating: flags/zw/zw.gif
$ ls flags | wc -w
194
$ find flags | grep .gif | wc -l
194
$ find flags | grep .json | wc -l
194
$ ls flags/ad
ad.gif metadata.json
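The same verification can be scripted. This is a minimal sketch, assuming each subdirectory holds a .gif named after the directory (as in flags/zw/zw.gif) plus a metadata.json file; the function name check_flags is my own and is not part of the examples.

```python
from pathlib import Path


def check_flags(flags_dir):
    """Return (number of subdirectories, list of missing files)."""
    problems = []
    subdirs = sorted(p for p in Path(flags_dir).iterdir() if p.is_dir())
    for subdir in subdirs:
        # Assumption: the image is named after its directory, e.g. zw/zw.gif.
        for name in (subdir.name + '.gif', 'metadata.json'):
            if not (subdir / name).is_file():
                problems.append(str(subdir / name))
    return len(subdirs), problems
```

Calling check_flags('flags') from the countries/ directory should report 194 subdirectories and an empty list of missing files if the archive was unpacked completely.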
Download and install nginx. I used version 1.6.2 -- the latest stable version as I write this.
- Download page: http://nginx.org/en/download.html
- Beginner's guide: http://nginx.org/en/docs/beginners_guide.html
Edit the nginx.conf file to set the port and document root.
You can determine which nginx.conf is in use by running:
$ nginx -V
The output starts with:
nginx version: nginx/1.6.2
built by clang 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
TLS SNI support enabled
configure arguments:...
Among the configure arguments you'll see --conf-path=. The path that
follows it is the file you will edit.
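If you prefer not to scan the configure arguments by eye, the path can be extracted with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes the path contains no spaces or quotes. Note that nginx -V writes its report to stderr, so the helper merges stderr into stdout.

```python
import re
import subprocess


def find_conf_path(version_output):
    """Return the value of --conf-path= in `nginx -V` output, or None."""
    match = re.search(r'--conf-path=(\S+)', version_output)
    return match.group(1) if match else None


def nginx_conf_path():
    """Run `nginx -V` and extract the configuration file path."""
    # nginx -V prints to stderr, so redirect stderr into stdout.
    result = subprocess.run(['nginx', '-V'], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return find_conf_path(result.stdout)
```

find_conf_path returns None when the argument is absent, which may happen if nginx was built without an explicit --conf-path.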
Most of the content in nginx.conf is within a block labeled http
and enclosed in curly braces. Within that block there can be multiple
blocks labeled server. Add another server block like this one:
server {
    listen 8001;
    location /flags/ {
        root /full-path-to.../countries/;
    }
}
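A common stumbling block in this server block is how root combines with location: as far as I know, nginx appends the full request URI to the root path, so the /flags/ prefix is not stripped. That is why root points at countries/ and not at countries/flags/. The sketch below mimics that mapping in Python; the root used in the example is a hypothetical stand-in for the elided full path, and the function is an illustration, not nginx's actual implementation.

```python
import posixpath


def resolve_with_root(root, uri):
    """Mimic how the nginx `root` directive maps a request URI to a file."""
    # With `root`, the whole URI (including the /flags/ prefix matched
    # by the location) is appended to the root path.
    return posixpath.join(root, uri.lstrip('/'))


# Hypothetical root, standing in for /full-path-to.../countries/
print(resolve_with_root('/home/user/countries/', '/flags/ad/ad.gif'))
```

If the root were set to the flags/ directory itself, the same request would look for flags/flags/ad/ad.gif and fail with a 404.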
After editing nginx.conf the server must be started (if it's not
running) or told to reload the configuration file:
$ nginx  # to start, if necessary
$ nginx -s reload  # to reload the configuration
To test the configuration, open the URL below in a browser. You should see the blue, yellow and red flag of Andorra:
http://localhost:8001/flags/ad/ad.gif
If the test fails, please double check the procedure just described and refer to the nginx documentation.
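The browser check can also be scripted. This is a minimal sketch, assuming nginx is listening on port 8001 as configured above; the function names are my own. It fetches the Andorra flag and checks that the body starts with a GIF signature.

```python
from urllib.request import urlopen

FLAG_URL = 'http://localhost:8001/flags/ad/ad.gif'


def looks_like_gif(data):
    """True if data starts with a GIF signature (GIF87a or GIF89a)."""
    return data[:6] in (b'GIF87a', b'GIF89a')


def check_flag(url=FLAG_URL, timeout=5):
    """Fetch url and report whether the server returned a GIF image."""
    # urlopen raises URLError if nginx is not running and HTTPError
    # (e.g. 404) if the location or root is misconfigured.
    with urlopen(url, timeout=timeout) as response:
        return response.status == 200 and looks_like_gif(response.read())
```

Calling check_flag() should return True once the server block is in place; an exception usually points at the port or the root path.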
At this point you may run the flags_*2.py examples against the nginx
install by changing the BASE_URL constant in flags_sequential2.py.
However, nginx is so fast that you will not see much difference in run
time between the sequential and the threaded versions, for example. For more
realistic testing with simulated network lag, we need vaurien.
vaurien depends on gevent which, as I write this, is only available for Python 2.5-2.7. To
install vaurien I opened another shell, created another virtualenv for
Python 2.7, and used that environment to install and run vaurien:
$ virtualenv-2.7 .env27 --no-site-packages --distribute
New python executable in .env27/bin/python
Installing setuptools, pip...done.
$ . .env27/bin/activate
(.env27)$ pip install vaurien
Downloading/unpacking vaurien
Downloading vaurien-1.9.tar.gz (50kB): 50kB downloaded
...many lines and a few minutes later...
Successfully installed vaurien cornice gevent statsd-client vaurienclient greenlet http-parser pyramid simplejson requests zope.interface translationstring PasteDeploy WebOb repoze.lru zope.deprecation venusian
Cleaning up...
(.env27)$
Using that same shell with the .env27 environment activated, run the vaurien_delay.sh script in the countries/ directory:
(.env27)$ ./vaurien_delay.sh
2015-02-25 20:20:17 [69124] [INFO] Starting the Chaos TCP Server
2015-02-25 20:20:17 [69124] [INFO] Options:
2015-02-25 20:20:17 [69124] [INFO] * proxies from localhost:8002 to localhost:8001
2015-02-25 20:20:17 [69124] [INFO] * timeout: 30
2015-02-25 20:20:17 [69124] [INFO] * stay_connected: 0
2015-02-25 20:20:17 [69124] [INFO] * pool_max_size: 100
2015-02-25 20:20:17 [69124] [INFO] * pool_timeout: 30
2015-02-25 20:20:17 [69124] [INFO] * async_mode: 1
The vaurien_delay.sh script adds a 1 s delay to every response.
There is also the vaurien_error_delay.sh script, which produces errors in 25% of the responses and adds a 0.5 s delay to 50% of the responses.
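To confirm that the proxy behaves as described, you can send a batch of requests through port 8002 (the proxy address shown in the vaurien log) and tally the outcomes. The sketch below is an assumption-laden illustration, not part of the examples: the names sample and fetch, the number of attempts, and the 0.4 s threshold for "slow" are all my own choices.

```python
import http.client
import time
from urllib.request import urlopen

PROXY_URL = 'http://localhost:8002/flags/ad/ad.gif'  # vaurien, per its log


def fetch(url, timeout=10):
    """Download url and return the response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def sample(url, attempts=20, slow_threshold=0.4,
           fetcher=fetch, clock=time.perf_counter):
    """Request url several times; tally successes, errors, slow replies."""
    tally = {'ok': 0, 'error': 0, 'slow': 0}
    for _ in range(attempts):
        start = clock()
        try:
            fetcher(url)
        except (OSError, http.client.HTTPException):
            # URLError, HTTPError and socket timeouts are OSError subclasses;
            # garbled responses raise http.client.HTTPException.
            tally['error'] += 1
            continue
        tally['ok'] += 1
        if clock() - start >= slow_threshold:
            tally['slow'] += 1
    return tally
```

With vaurien_delay.sh running, sample(PROXY_URL) should count every successful response as slow. With vaurien_error_delay.sh, I would expect roughly a quarter of the attempts to land in 'error', though the exact numbers vary from run to run.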
On Mac OS X, nginx can be installed with Homebrew (copy and paste the code at the bottom of http://brew.sh/):
$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ brew doctor
$ brew install nginx
Download and unpack:
Docroot is: /usr/local/var/www
Configuration file is: /usr/local/etc/nginx/nginx.conf
- To have launchd start nginx at login:
  ln -sfv /usr/local/opt/nginx/*.plist ~/Library/LaunchAgents
- Then to load nginx now:
  launchctl load ~/Library/LaunchAgents/homebrew.mxcl.nginx.plist
- Or, if you don't want/need launchctl, you can just run:
  nginx
Docroot is: /usr/share/nginx/html