import csv
from collections import defaultdict
from math import sqrt
from shutil import rmtree
from time import time

import blz
import matplotlib.pyplot as plt
import numba
import numpy as np
import pandas


@numba.njit
def mandel(x, y, max_iters):
    """
    Given the real and imaginary parts of a complex number,
    determine if it is a candidate for membership in the Mandelbrot
    set given a fixed number of iterations.
    """
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z*z + c
        if (z.real*z.real + z.imag*z.imag) >= 4:
            return i

    return max_iters


def create_fractal(height, width, min_x, max_x, min_y, max_y, image, row, iters):
    pixel_size_x = (max_x - min_x) / width
    pixel_size_y = (max_y - min_y) / height

    # x runs over the image rows and y over the columns
    for x in range(height):
        imag = min_y + x * pixel_size_y
        for y in range(width):
            real = min_x + y * pixel_size_x
            color = mandel(real, imag, iters)
            row[y] = color
        # Append the finished row to the on-disk container
        image.append(row)


height = 20000
width = 30000

# If the blz already exists, remove it
rmtree('images/Mandelbrot.blz', ignore_errors=True)

image = blz.zeros((0, width), rootdir='images/Mandelbrot.blz', dtype=np.uint8,
                  expectedlen=height*width, bparams=blz.bparams(clevel=0))
row = np.zeros((width), dtype=np.uint8)

t1 = time()
create_fractal(height, width, -2.0, 1.0, -1.0, 1.0, image, row, 20)
t2 = time()

image.flush()

print(t2 - t1)


def copy(src, dest, clevel=5, shuffle=True, cname="blosclz"):
    """
    Copy the barray stored at `src` to `dest` using the given
    compression parameters.

    Parameters
    ----------
    clevel : int (0 <= clevel < 10)
        The compression level.
    shuffle : bool
        Whether the shuffle filter is active or not.
    cname : string ('blosclz', 'lz4hc', 'snappy', 'zlib', others?)
        Select the compressor to use inside Blosc.
    """
    src = blz.barray(rootdir=src)
    img_copied = src.copy(rootdir=dest,
                          bparams=blz.bparams(clevel=clevel, shuffle=shuffle, cname=cname),
                          expectedlen=src.size)
    img_copied.flush()


@numba.njit
def mymean(src, p0, p1, y):
    # Each output cell y[i, j] is the mean of a p0 x p1 block of src
    factor = 1/(1. * p0 * p1)
    for i in range(y.shape[0]):
        for j in range(y.shape[1]):
            s = 0.
            for k in range(p0):
                for l in range(p1):
                    s += src[(p0*i)+k, (p1*j)+l] * factor
            y[i, j] = s


def downsample(orig, down_cell, cache_size=2**21):
    c0, c1 = down_cell

    # Let's calculate the matrix dimensions
    pixel_size = orig[0, 0].nbytes
    n = int(round(sqrt(cache_size / float(pixel_size)), 0))

    # How many complete matrices?
    hor = int(orig.shape[1]) // n
    ver = int(orig.shape[0]) // n

    # Complete matrix dimensions (cast to int: shapes and slices need integers)
    submatrix_n = int(round(n/float(c0)))
    submatrix_center_shape = (submatrix_n, submatrix_n)
    submatrix_center = np.empty(submatrix_center_shape, dtype=orig.dtype)

    # Bottom border matrix dimensions
    ver_px = (int(orig.shape[0]) % n) // c0
    submatrix_bottom_shape = (ver_px, submatrix_n)
    submatrix_bottom = np.empty(submatrix_bottom_shape, dtype=orig.dtype)

    # Right border matrix dimensions
    hor_px = (int(orig.shape[1]) % n) // c1
    submatrix_right_shape = (submatrix_n, hor_px)
    submatrix_right = np.empty(submatrix_right_shape, dtype=orig.dtype)

    # Corner matrix dimensions
    submatrix_corner_shape = (ver_px, hor_px)
    submatrix_corner = np.empty(submatrix_corner_shape, dtype=orig.dtype)

    # We build the final container
    final_shape = (submatrix_n * ver + ver_px, submatrix_n * hor + hor_px)
    final = np.empty(final_shape, orig.dtype)

    # Downsample the middle of the image
    for i in range(ver):
        for j in range(hor):
            # Get the optimal matrix
            submatrix = orig[i*n:(i+1)*n, j*n:(j+1)*n]
            mymean(submatrix, c0, c1, submatrix_center)
            final[i*submatrix_n:(i+1)*submatrix_n,
                  j*submatrix_n:(j+1)*submatrix_n] = submatrix_center

    # Downsample the right border
    for i in range(ver):
        submatrix = orig[i*n:(i+1)*n, hor*n:]
        mymean(submatrix, c0, c1, submatrix_right)
        final[i*submatrix_n:(i+1)*submatrix_n, submatrix_n*hor:] = submatrix_right

    # Downsample the bottom border
    for j in range(hor):
        submatrix = orig[ver*n:, j*n:(j+1)*n]
        mymean(submatrix, c0, c1, submatrix_bottom)
        final[submatrix_n*ver:, j*submatrix_n:(j+1)*submatrix_n] = submatrix_bottom

    # Downsample the corner
    submatrix = orig[n*ver:, n*hor:]
    mymean(submatrix, c0, c1, submatrix_corner)
    final[submatrix_n*ver:, submatrix_n*hor:] = submatrix_corner

    return final


def benchmark(src, cmethods):
    for method in cmethods:
        with open('csv/' + method + '.csv', 'w', newline='') as myfile:
            wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
            wr.writerow(['Compression level', 'Compressed size', 'Compression ratio',
                         'Compression time', 'Downsampling time'])
            for compression_level in range(0, 10):
                # Get the original image and compress it, measuring the time
                tc1 = time()
                copy(src, 'images/temp.blz', compression_level, False, method)
                tc2 = time()

                # Get disk size
                img = blz.barray(rootdir='images/temp.blz')
                disk_size = img.cbytes

                # Now downsample it, measuring the time
                t1 = time()
                downsample(img, (4, 4))
                t2 = time()

                # I should store the size of the file, compression method,
                # shuffle and compression ratio
                row = [compression_level, disk_size, round(img.nbytes/float(disk_size), 3),
                       str(tc2 - tc1), str(t2 - t1)]

                # Add it to the csv
                wr.writerow(row)
                rmtree('images/temp.blz', ignore_errors=True)


src = 'images/Mandelbrot.blz'
cmethods = ['blosclz', 'lz4hc', 'snappy', 'zlib']

t1 = time()
benchmark(src, cmethods)
t2 = time()

print(t2 - t1)


def get_dict(filename):
    columns = defaultdict(list)
    with open('csv/' + filename) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            for (i, v) in enumerate(row):
                if i == 1:
                    # Scale the compressed size (column 1) by 2**17
                    columns[i].append(float(v)/131072)
                    continue
                # Store numbers rather than strings so they plot on a numeric axis
                columns[i].append(float(v))
    return columns


blosclz = get_dict('blosclz.csv')
lz4hc = get_dict('lz4hc.csv')
snappy = get_dict('snappy.csv')
zlib = get_dict('zlib.csv')

print('blosclz data')
df = pandas.read_csv('csv/blosclz.csv')
df

print('lz4hc data')
df = pandas.read_csv('csv/lz4hc.csv')
df

print('snappy data')
df = pandas.read_csv('csv/snappy.csv')
df

print('zlib data')
df = pandas.read_csv('csv/zlib.csv')
df

%matplotlib inline

# Matplotlib magic
fig = plt.figure()
ax = fig.add_subplot(111)
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
plt.subplots_adjust(wspace=0.6, hspace=0.6)

ax.spines['top'].set_color('none')
ax.spines['bottom'].set_color('none')
ax.spines['left'].set_color('none')
ax.spines['right'].set_color('none')
ax.tick_params(labelcolor='w', top=False, bottom=False, left=False, right=False)
plt.rcParams['figure.figsize'] = 10, 10

ax.set_xlabel('Compression ratio')
ax.set_ylabel('Compression level')

# blosclz
ax1.set_title('blosclz')
ax1.plot(blosclz[2][2:], blosclz[0][2:], 'bo-')

# lz4hc
ax2.set_title('lz4hc')
ax2.plot(lz4hc[2][2:], lz4hc[0][2:], 'bo-')

# snappy
ax3.set_title('snappy')
ax3.plot(snappy[2][2:], snappy[0][2:], 'bo-')

# zlib
ax4.set_title('zlib')
ax4.plot(zlib[2][2:], zlib[0][2:], 'bo-')

plt.show()

# Matplotlib magic
fig = plt.figure()
ax = fig.add_subplot(111)
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
plt.subplots_adjust(wspace=0.6, hspace=0.6)

ax.spines['top'].set_color('none')
ax.spines['bottom'].set_color('none')
ax.spines['left'].set_color('none')
ax.spines['right'].set_color('none')
ax.tick_params(labelcolor='w', top=False, bottom=False, left=False, right=False)
plt.rcParams['figure.figsize'] = 10, 10

ax.set_ylabel('Compression time (s)')
ax.set_xlabel('Compression ratio')

# blosclz
ax1.set_title('blosclz')
ax1.plot(blosclz[2][2:], blosclz[3][2:], 'bo-')

# lz4hc
ax2.set_title('lz4hc')
ax2.plot(lz4hc[2][2:], lz4hc[3][2:], 'bo-')

# snappy
ax3.set_title('snappy')
ax3.plot(snappy[2][2:], snappy[3][2:], 'bo-')

# zlib
ax4.set_title('zlib')
ax4.plot(zlib[2][2:], zlib[3][2:], 'bo-')

plt.show()

# Matplotlib magic
fig = plt.figure()
ax = fig.add_subplot(111)
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
plt.subplots_adjust(wspace=0.6, hspace=0.6)

ax.spines['top'].set_color('none')
ax.spines['bottom'].set_color('none')
ax.spines['left'].set_color('none')
ax.spines['right'].set_color('none')
ax.tick_params(labelcolor='w', top=False, bottom=False, left=False, right=False)
plt.rcParams['figure.figsize'] = 10, 10

ax.set_ylabel('Downsampling time (s)')
ax.set_xlabel('Compression level')

# Y value for the red line: mean downsampling time at compression level 0
# (uncompressed) across the four compressors
Y = (float(blosclz[4][0]) + float(lz4hc[4][0]) +
     float(snappy[4][0]) + float(zlib[4][0])) / len(cmethods)

# blosclz
ax1.set_title('blosclz')
ax1.plot(blosclz[0][2:], blosclz[4][2:], 'bo-')
# Uncompressed point
ax1.axhline(Y, color='r')

# lz4hc
ax2.set_title('lz4hc')
ax2.plot(lz4hc[0][2:], lz4hc[4][2:], 'bo-')
# Uncompressed point
ax2.axhline(Y, color='r')

# snappy
ax3.set_title('snappy')
ax3.plot(snappy[0][2:], snappy[4][2:], 'bo-')
# Uncompressed point
ax3.axhline(Y, color='r')

# zlib
ax4.set_title('zlib')
ax4.plot(zlib[0][2:], zlib[4][2:], 'bo-')
# Uncompressed point
ax4.axhline(Y, color='r')

plt.show()