In [1]:

```
from IPython.display import display
from IPython.display import HTML
```

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10 001st prime number?

Another prime question, this time asking for the 10,001st prime. We tested for primes before in question 3, so it's time to reuse that code.

**Method 1: brute force**

In [84]:

```
def isPrime(x):
    if (x==1):
        return False
    for i in range(2,x):
        if x%i==0:
            return False
    return True

def getPrimes(maxValue):
    primes = []
    for i in range(1,maxValue):
        if isPrime(i):
            primes.append(i)
    return primes

primes = getPrimes(10000)
```

In [86]:

```
%%timeit
getPrimes(10000)
```

In [85]:

```
len(primes)
```

Out[85]:

The brute force solution takes more than a second to calculate the primes up to 10000. And how many primes did that yield? Only 1229! This doesn't look like a reasonable way to find the first 10001 primes. Luckily, there is a very simple and clever algorithm that can do this job much faster.

**Method 2: Sieve of Eratosthenes**

The basic notion of the sieve of Eratosthenes is to pre-allocate a list of numbers up to n, and then, taking a prime (starting with 2), cross out every __multiple__ of that prime, as those multiples clearly can't be primes. The next prime is then the next unmarked value in the list. The process repeats until there are no more primes to be found.

In [123]:

```
def showState(l, p, nx):
    numbers = ''
    for n in l:
        style=''
        if n<0:
            style+='text-decoration: line-through; background-color: rgb(171, 231, 255);'
        if n==p:
            style+='background-color: rgb(230,255,95);'
        if n==nx:
            style+='background-color: rgb(150, 233, 150);'
        if n==0:
            style+='background-color: rgb(220,220,220); color: rgb(220,220,220);'
        numbers+='<span style="{0}">{1}</span> '.format(style, abs(n))
    s = """<div>{0}</div>""".format(numbers)
    h = HTML(s)
    display(h)

def sieve(size, showStates=True):
    l = list(range(2,size+1)) #generate the candidate set
    idx = lambda x: x-2 #just a simple mapping from number in list to list index
    p = 2 #seed with initial prime
    for iteration in range(len(l)):
        #mark every multiple of p
        for i in range(p*2, size+1, p):
            l[idx(i)] = -i
        #find the next unmarked value, that's the next p
        nextPrime = 0
        for i in l[idx(p+1):]:
            if i>0:
                nextPrime = i
                break
        if (showStates):
            showState(l, p, nextPrime)
        #clear the marked multiples now that they have been displayed
        for i in range(p*2, size+1, p):
            l[idx(i)] = 0
        p = nextPrime
        #if we haven't found any unmarked values, we're done
        if p == 0:
            break
    #return all unmarked values as a list, so len() and indexing work
    return [x for x in l if x>0]

sieve(38, True)
```

Out[123]:

Above is the state of the preallocated list at each iteration of sifting primes up to 38.

Starting with a fully unmarked list, and the first prime, 2 (shown in yellow), every multiple of 2 is marked off in the list (shown in blue). The next prime (green) is then found by moving up the list until the first unmarked number.

The next iteration starts at the newly found prime, 3, and proceeds to mark off every multiple of 3 in the list, and so forth.

Finally, the last iteration attempts to find unmarked values to the right of 37 and finds none. At that point the algorithm can terminate and return the remaining unmarked values in the list.

In [58]:

```
%%timeit
v = sieve(10000, False)
```

In [72]:

```
len(sieve(10000, False))
```

Out[72]:

At less than 60ms to find all primes less than 10000, this algorithm is orders of magnitude faster.

It can be further optimized by recognizing that if one factor of a number is greater than its square root, then the other factor must be less than its square root. Hence __all multiples of primes greater than the square root of n need not be considered__^{[1]}: every composite up to n has already been marked by a smaller prime. The sieve function can be trivially modified to use this knowledge by limiting the marking phase to \(\sqrt{n}\).

^{[1]} http://britton.disted.camosun.bc.ca/jberatosthenes.htm

In [73]:

```
#comments removed for brevity
def sieve(size, showStates=True):
    l = list(range(2,size+1))
    idx = lambda x: x-2
    p = 2
    #about sqrt(n) iterations suffice: by then p has passed sqrt(n)
    for iteration in range(int(0.5+len(l)**0.5)):
        for i in range(p*2, size+1, p):
            l[idx(i)] = -i
        nextPrime = 0
        for i in l[idx(p+1):]:
            if i>0:
                nextPrime = i
                break
        if (showStates):
            showState(l, p, nextPrime)
        for i in range(p*2, size+1, p):
            l[idx(i)] = 0
        p = nextPrime
        if p == 0:
            break
    return [x for x in l if x>0]
```

In [74]:

```
%%timeit
v = sieve(10000, False)
```
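The same square-root observation also applies to the brute-force test from Method 1. The following is a minimal sketch, assuming Python 3.8+ for `math.isqrt`; the names `is_prime_sqrt` and `primes_below` are illustrative and do not appear in the original notebook.

```python
from math import isqrt

def is_prime_sqrt(x):
    """Trial division that stops at the integer square root of x."""
    if x < 2:
        return False
    # any factor above sqrt(x) pairs with one below it, so stop there
    for i in range(2, isqrt(x) + 1):
        if x % i == 0:
            return False
    return True

def primes_below(max_value):
    """All primes strictly below max_value, found by trial division."""
    return [i for i in range(2, max_value) if is_prime_sqrt(i)]

print(len(primes_below(10000)))  # 1229, the same count as the brute force run
```

This keeps the simplicity of trial division while cutting the work per candidate from roughly x divisions to roughly the square root of x.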

So the Eratosthenes sieve is very fast at finding primes up to some limit m. At m=10000, we find n=1229 primes. What range do we have to sieve to actually get our n=10001 primes?

Rosser's theorem^{[2]} provides a useful inequality that establishes bounds on the value of the n^{th} prime number:

\(\ln n + \ln\ln n - 1 < \frac{p_n}{n} < \ln n + \ln \ln n \quad\text{for } n \ge 6\)

^{[2]} http://en.wikipedia.org/wiki/Prime_number_theorem#Approximations_for_the_nth_prime_number

In [114]:

```
from math import log

def maxPrime(n):
    #upper bound on the nth prime from Rosser's theorem, valid for n >= 6
    return int(0.5+(float(n)*log(n)+ n*log(log(n))))
limit = maxPrime(10001)
print('The 10001st prime has a value < {0}'.format(limit))
```
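As a quick sanity check of the inequality, both bounds can be evaluated for a case where the answer is already known: the 6th prime, 13. This is a small sketch assuming Python 3; `rosser_bounds` is a hypothetical helper name, not part of the original notebook.

```python
from math import log

def rosser_bounds(n):
    """Return (lower, upper) bounds on the nth prime, valid for n >= 6."""
    if n < 6:
        raise ValueError("the bounds are only stated for n >= 6")
    lower = n * (log(n) + log(log(n)) - 1)
    upper = n * (log(n) + log(log(n)))
    return lower, upper

lower, upper = rosser_bounds(6)
print(lower < 13 < upper)  # True: the 6th prime lies between the bounds
```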

In [115]:

```
primes = sieve(limit, False)
len(primes)
```

Out[115]:

The upper bound function appears to have done its job and netted just over 10000 primes. We can now obtain the 10001^{st} prime:

In [118]:

```
primes[10000]
```

Out[118]:
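Putting the two ideas together, the bound and the sieve can be wrapped into one function. The sketch below assumes Python 3 and swaps in a compact boolean-array sieve rather than the list-marking version above; `primes_up_to` and `nth_prime` are illustrative names.

```python
from math import log

def primes_up_to(limit):
    """Boolean-array sieve of Eratosthenes; returns all primes <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # start at p*p: smaller multiples were marked by smaller primes
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = 0
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]

def nth_prime(n):
    """Return the nth prime (1-indexed), sieving up to Rosser's upper bound."""
    if n < 6:
        # the bound is only valid for n >= 6, so handle small n directly
        return [2, 3, 5, 7, 11][n - 1]
    limit = int(n * (log(n) + log(log(n)))) + 1
    return primes_up_to(limit)[n - 1]

print(nth_prime(6))  # 13, as in the problem statement
```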

The four adjacent digits in the 1000-digit number that have the greatest product are 9 × 9 × 8 × 9 = 5832.

73167176531330624919225119674426574742355349194934 96983520312774506326239578318016984801869478851843 85861560789112949495459501737958331952853208805511 12540698747158523863050715693290963295227443043557 66896648950445244523161731856403098711121722383113 62229893423380308135336276614282806444486645238749 30358907296290491560440772390713810515859307960866 70172427121883998797908792274921901699720888093776 65727333001053367881220235421809751254540594752243 52584907711670556013604839586446706324415722155397 53697817977846174064955149290862569321978468622482 83972241375657056057490261407972968652414535100474 82166370484403199890008895243450658541227588666881 16427171479924442928230863465674813919123162824586 17866458359124566529476545682848912883142607690042 24219022671055626321111109370544217506941658960408 07198403850962455444362981230987879927244284909188 84580156166097919133875499200524063689912560717606 05886116467109405077541002256983155200055935729725 71636269561882670428252483600823257530420752963450

Find the thirteen adjacent digits in the 1000-digit number that have the greatest product. What is the value of this product?

In [121]:

```
source = '''
73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450
'''.replace('\n','')
import numpy as np

#break the source string into a series of 13 character long slices at every possible position
window_size = 13
slices = [source[x:x+window_size] for x in range(len(source) - window_size + 1)]
#compute the product of each slice (int64 is wide enough for 9**13)
products = [np.prod([int(d) for d in row], dtype='int64') for row in slices]
max(products)
```

Out[121]:
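The same windowed product can be written as a small reusable function without NumPy. This is a sketch assuming Python 3.8+ for `math.prod`; `max_window_product` is an illustrative name, and the short digit string is only there to echo the four-digit example from the problem statement.

```python
from math import prod

def max_window_product(digits, window_size):
    """Greatest product of window_size adjacent digits in a digit string."""
    values = [int(d) for d in digits]
    return max(
        prod(values[i:i + window_size])
        for i in range(len(values) - window_size + 1)
    )

print(max_window_product("9989", 4))  # 9 * 9 * 8 * 9 = 5832
```

Python integers do not overflow, so no explicit 64-bit type is needed here.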

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?

This is an interesting problem!

First things first: we can establish an upper bound, since \(1 \times 2 \times 3 \times \dots \times 20\), or simply \(20!\), certainly meets the condition. We can work our way down by repeatedly dividing this upper bound by a number in the range [2,20] whenever the quotient is still divisible by every number in the range.

This approach results in a runtime complexity of O(log(n!)), better known as O(n log n)

In [16]:

```
import math

factors = 20
upper = math.factorial(factors)
divisors = range(2, factors+1)
current = upper
#repeatedly attempt to divide the current number by a divisor, ordered from
#largest to smallest, as long as the quotient is still divisible by every divisor
while True:
    found = False
    for p in reversed(divisors):
        c = current // p
        if current % p == 0 and all(c % d == 0 for d in divisors):
            found = True
            current = c
            break
    if not found:
        break
    print('divided by', p, 'got', current)
```
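A different route to the same number, swapped in here for comparison, is to build the least common multiple upwards instead of dividing the factorial down. The sketch assumes Python 3; `lcm` and `smallest_multiple` are illustrative names.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def smallest_multiple(n):
    """Smallest positive number evenly divisible by every integer in 1..n."""
    return reduce(lcm, range(1, n + 1), 1)

print(smallest_multiple(10))  # 2520, as in the problem statement
```

Calling `smallest_multiple(20)` answers the question directly with one gcd computation per number in the range.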

The sum of the squares of the first ten natural numbers is, 1^{2} + 2^{2} + ... + 10^{2} = 385

The square of the sum of the first ten natural numbers is, (1 + 2 + ... + 10)^{2} = 55^{2} = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

**Method 1: brute force**

Complexity: O(N)

In [10]:

```
import math

def squareDiff(x):
    s = range(1, x+1)
    sumSquares = sum([i*i for i in s])
    squareSum = math.pow(sum(s),2)
    diff = squareSum - sumSquares
    return diff

squareDiff(100)
```

Out[10]:

Easy enough. However, it's well known that the sum of the natural numbers up to n can be calculated as \(\frac{n(n+1)}{2}\).

Is it possible that the sum of a series of natural numbers squared up to n can be calculated in constant time as well? I didn't know the answer and cheated by using a genetic algorithm to attempt to fit an equation to match *sumSquares* for the first 140 inputs.

Amazingly, it came back with a polynomial that had 0 residual error: \(\frac{1}{6}n + \frac{1}{2}n^2 + \frac{1}{3}n^3\)

Let's plot this polynomial to double check

In [11]:

```
brute = lambda n: sum([x*x for x in xrange(1,n+1)])
poly = lambda n: round(1./6 * n + 1./2 * pow(n, 2) + 1./3 * pow(n, 3))
x = np.array(range(1,2200))
brute_y = np.array([brute(t) for t in x])
poly_y = np.array([poly(t) for t in x])
plt.plot(x, brute_y, label='Brute Force', color='blue')
plt.plot(x, poly_y, label='Polynomial', color='red')
plt.legend()
print 'max error:', max(brute_y - poly_y)
```

Looks like the polynomial solution suffers from integer overflow at around \(n \approx 1300\), earlier than the brute force solution. This is understandable considering the polynomial solution deals with n^{3} while brute force only deals with n^{2}. We'll switch to floats to overcome overflow issues in both cases.

In [12]:

```
brute = lambda n: sum([1.*x*x for x in xrange(1,n+1)])
poly = lambda n: round(1./6 * n + 1./2 * pow(n, 2.) + 1./3 * pow(n, 3.))
x = np.array(range(1,2200))
brute_y = np.array([brute(t) for t in x])
poly_y = np.array([poly(t) for t in x])
plt.plot(x, brute_y, label='Brute Force', color='blue')
plt.plot(x, poly_y, label='Polynomial', color='red')
plt.legend()
print 'max error:', max(brute_y - poly_y)
```

A maximum error of 0 across a small input set is promising. However, I got in touch with a friend to check, and he promptly came back with a proof!

All credit to **Jonah Schreiber** for the below proof:

\[\sum_{x=1}^{n}{x^2}=\frac{n^3}{3}+\frac{n^2}{2}+\frac{n}{6}\]

**Base**

Here, we show that the formula is correct for \(n=1\). On the left-hand side, we have \(1\), and on the right-hand side, we have \(\frac{1^3}{3}+\frac{1^2}{2}+\frac{1}{6}=1\), so the base case is true.

**Assumption**

Assume that

\[\sum_{x=1}^{n}{x^2}=\frac{n^3}{3}+\frac{n^2}{2}+\frac{n}{6}\]

is true.

**Induction**

Show then that it is true for \(n+1\), that is,

\[\sum_{x=1}^{n+1}{x^2}=\frac{(n+1)^3}{3}+\frac{(n+1)^2}{2}+\frac{n+1}{6}\]

Let us break out the last term on the left-hand side, and expand the right-hand side:

\[\sum_{x=1}^{n}{x^2}+(n+1)^2=\frac{n^3+3n^2+3n+1}{3}+\frac{n^2+2n+1}{2}+\frac{n+1}{6}\]

We already know the sum on the left-hand side, which we insert.

\[\frac{n^3}{3}+\frac{n^2}{2}+\frac{n}{6}+n^2+2n+1=\frac{n^3+3n^2+3n+1}{3}+\frac{n^2+2n+1}{2}+\frac{n+1}{6}\]

Collecting terms, we get

\[\frac{n^3}{3}+\frac{3n^2}{2}+\frac{13n}{6}+1=\frac{n^3}{3}+\frac{3n^2}{2}+\frac{13n}{6}+1\]

The two sides are equal, so the formula is correct.

Thanks again to Jonah Schreiber for the above proof.

See http://www.trans4mind.com/personal_development/mathematics/series/sumNaturalSquares.htm for several derivations of this formula.
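The formula can also be spot-checked with exact rational arithmetic, which sidesteps the rounding and overflow questions from the plots. This is only a numerical check under the assumption of Python 3, not a replacement for the proof; `sum_squares_poly` is an illustrative name.

```python
from fractions import Fraction

def sum_squares_poly(n):
    """Evaluate n^3/3 + n^2/2 + n/6 exactly using rationals."""
    n = Fraction(n)
    return n**3 / 3 + n**2 / 2 + n / 6

# compare against the brute force sum for a range of inputs
for n in range(1, 201):
    assert sum_squares_poly(n) == sum(x * x for x in range(1, n + 1))

print(sum_squares_poly(10))  # 385, the sum of squares from the problem statement
```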

And so, finally,

**Method 2: Arithmetic**

Complexity: O(1)

In [13]:

```
def squareDiff2(x):
    sumSquares = round(1./6 * x + 1./2 * pow(x, 2.) + 1./3 * pow(x, 3.))
    squareSum = pow(x*(x+1)/2., 2)
    return squareSum - sumSquares

squareDiff2(100)
```

Out[13]:
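The polynomial factorises as \(\frac{n(n+1)(2n+1)}{6}\), since \(n(n+1)(2n+1) = 2n^3 + 3n^2 + n\). That makes an integer-only version possible, avoiding floats and rounding altogether. A minimal sketch, assuming Python 3; `square_diff_exact` is an illustrative name.

```python
def square_diff_exact(n):
    """Square of the sum minus sum of the squares for 1..n, in exact integers."""
    sum_squares = n * (n + 1) * (2 * n + 1) // 6  # always divisible by 6
    square_sum = (n * (n + 1) // 2) ** 2          # n(n+1) is always even
    return square_sum - sum_squares

print(square_diff_exact(10))  # 2640, as in the problem statement
```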

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

In [1]:

```
#trivial with python's comprehensions
sum(x for x in range(1,1000) if x%5==0 or x%3==0)
```

Out[1]:
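The sum can also be computed without a loop, using the arithmetic series formula and inclusion-exclusion: add the multiples of 3 and of 5, then subtract the multiples of 15 that were counted twice. A sketch assuming Python 3, with illustrative function names:

```python
def sum_multiples_below(k, limit):
    """Sum of all positive multiples of k strictly below limit."""
    m = (limit - 1) // k          # number of multiples of k below limit
    return k * m * (m + 1) // 2   # k * (1 + 2 + ... + m)

def sum_3_or_5_below(limit):
    """Sum of all multiples of 3 or 5 strictly below limit."""
    return (sum_multiples_below(3, limit)
            + sum_multiples_below(5, limit)
            - sum_multiples_below(15, limit))

print(sum_3_or_5_below(10))  # 23, as in the problem statement
```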

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

**Method 1: memoization**

In [2]:

```
#set base case
fibs = {0:0, 1:1}

def fib(n):
    ret = fibs.get(n, None)
    if ret is not None:
        return ret
    ret = fib(n-2) + fib(n-1)
    fibs[n] = ret
    return ret

def getEvenSum(upperBound = 4000000):
    evenValued = []
    for i in range(upperBound):
        v = fib(i)
        if v > upperBound:
            break
        if v%2 == 0:
            evenValued.append(v)
    return sum(evenValued)

getEvenSum()
```

Out[2]:

In [3]:

```
%%timeit
getEvenSum()
```

This isn't terrible, but we can do a lot better by iterating from the bottom up.

**Method 2: iterative**

In [4]:

```
def getEvenSum(upperBound = 4000000):
    previous = 0
    current = 1
    even = 0
    #check each term before adding it, so a term above the bound is never counted
    while current <= upperBound:
        if current % 2 == 0:
            even += current
        previous, current = current, previous + current
    return even

getEvenSum()
```

Out[4]:

In [5]:

```
%%timeit
getEvenSum()
```
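Looking at the sequence 1, 2, 3, 5, 8, 13, 21, 34, every third term is even, and the even terms satisfy their own recurrence: each one is four times the previous even term plus the one before that (34 = 4 × 8 + 2). The sketch below, assuming Python 3, uses that to skip the odd terms entirely; `even_fib_sum` is an illustrative name.

```python
def even_fib_sum(upper_bound=4000000):
    """Sum of the even-valued Fibonacci terms not exceeding upper_bound."""
    a, b = 2, 8  # the first two even-valued terms
    total = 0
    while a <= upper_bound:
        total += a
        a, b = b, 4 * b + a  # next even term from the previous two
    return total

print(even_fib_sum(100))  # 2 + 8 + 34 = 44
```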

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

This is an uninspired brute force solution, see problem 6 for a better way to find primes

In [6]:

```
def isPrime(x):
    if (x==1):
        return False
    for i in range(2,x):
        if x%i==0:
            return False
    return True

#build a list of all primes below 20000
primes = []
for i in range(1,20000):
    if isPrime(i):
        primes.append(i)
primes[-3:]
```

Out[6]:

In [7]:

```
#collect the prime factors of the target number from the list, largest first
target = 600851475143
factors = []
for p in reversed(primes):
    if target % p == 0:
        factors.append(p)
factors
```

Out[7]:
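A precomputed prime list is not strictly needed here. As an alternative sketch (assuming Python 3, with the illustrative name `prime_factors`), repeated trial division strips each factor out of the target as it is found, so only prime divisors ever succeed and the loop can stop once the divisor passes the square root of what remains.

```python
def prime_factors(n):
    """Prime factorisation of n by trial division, smallest factor first."""
    factors = []
    d = 2
    while d * d <= n:
        # divide out d completely; composites never divide what is left
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors

print(prime_factors(13195))  # [5, 7, 13, 29], as in the problem statement
```

The largest prime factor is then simply the last element of the returned list.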

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.

Find the largest palindrome made from the product of two 3-digit numbers.

In [8]:

```
#nothing clever here, brute force over the search space and pick out all the palindromes
palindromes = []

def isPalindrome(x):
    s = str(x)
    return s == s[::-1]

#note the inner loop doesn't cover the entire search space but starts at x
#this halves work by not calculating redundant products like (1*2, 2*1)
for x in reversed(range(100, 1000)):
    for y in reversed(range(x, 1000)):
        n = x*y
        if (isPalindrome(n)):
            palindromes.append(n)
max(palindromes)
```

Out[8]:
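Because both loops run from the largest values downwards, the search can stop early once no remaining product can beat the best palindrome found so far. The sketch below assumes Python 3; `largest_palindrome_product` and its `low`/`high` parameters are illustrative and not part of the original notebook.

```python
def largest_palindrome_product(low, high):
    """Largest palindromic x*y with low <= x <= y < high, or 0 if none."""
    best = 0
    for x in range(high - 1, low - 1, -1):
        if x * (high - 1) <= best:
            break  # no remaining x can beat the best product found
        for y in range(high - 1, x - 1, -1):
            n = x * y
            if n <= best:
                break  # products only shrink as y decreases
            if str(n) == str(n)[::-1]:
                best = n
    return best

print(largest_palindrome_product(10, 100))  # 9009 = 91 x 99
```

Calling `largest_palindrome_product(100, 1000)` covers the three-digit case while examining only a small fraction of the pairs.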

I was hoping that I could replace my laptop with the surface for a super mobile setup that can move between work and home. The surface pro 2 certainly has enough horsepower and ram to get serious work done, and with a full size keyboard and dual screens at work and home, I shouldn’t have to deal with the tiny keyboard and onboard screen too much. In return I’d get a bonus tablet and a full fledged core i5 pc in a tiny form factor, a pretty sweet deal.

After spending a week with the surface, I have to say that it didn’t work as well as I hoped. The following are the highlights of my experience.

**The type cover is not good enough**

The increased key travel on the type cover 2 is a welcome improvement, but it comes with a step backwards on the track-pad which used to have physical clickable buttons and a rubbery surface. These have been replaced with capacitive buttons and a felt finish, which is starting to fade in high traffic areas after a mere week of use. The bottom line is that despite the improvements, it’s still just too small a keyboard to do serious work with. I got the full fledged pc experience I wanted at home and work, but without a keyboard attached the usefulness of the surface is severely crippled.

The biggest redeeming factor here is the active digitizer pen, which is amazing. While it doesn’t replace a mouse, it’s a great supplement and feels very natural to use. The touch screen is just no comparison to the fine control you get with the pen, and the active digitizer means you can do things equivalent to mouse movements without clicking by hovering the pen over the screen. I actually miss having the pen when using other computers now, and I hope to see a laptop come with this feature in the future.

**The pixel density on the screen is too high, kinda**

The show-stopping problem to me stemmed from the small screen, but not for the obvious reason. I was quite aware that a 10″ screen isn’t much to do real work on, and prepared to supplement it with dual monitors. What I wasn’t prepared for is how terrible DPI scaling is on windows.

Here’s the problem: the screen is a full hd 1920×1080 panel packed into a mere 10 inches. Applications not specifically designed to deal with high dpi displays render tiny text and tiny buttons which are very difficult to read and impossible to click on with your finger. Windows alleviates this with a dpi scaling option which forces applications to increase the size of what they’re rendering. Unfortunately unless applications were written to deal with this, it looks like they just get upscaled with what looks to be a bilinear filter. The result is that everything is blurry! This affects just about every application I’ve used except for internet explorer and visual studio. Even chrome and firefox don’t support dpi scaling and can at best be hacked by being run in compatibility mode to prevent scaling, followed by increasing page zoom or default font size. This tends to mess up some site layouts and still leaves you with tiny unreadable tabs and other native UI components.

Here’s IE vs what you get with chrome out of the box:

And here’s opera hacked to work sort-of okay vs chrome out of the box. Note the tiny tabs and broken visuals on opera.

I was also surprised to find that .net winforms applications that I’m developing have the same scaling problems. I assumed that Microsoft would definitely make sure that apps built with visual studio are ready to run on the surface out of the box. As a developer, I had no idea about this being a problem until I was on the receiving end, and I suspect that that’s the case with a lot of applications out there.

The problem is made even worse by the fact that the dpi scaling setting is global across all monitors in a multi-monitor setup. This means that when I plug the surface in at work, I either get giant scaled graphics on the large screens, or tiny unreadable graphics on the surface. On top of that, the hacky application setups to prevent blurring that I described above, are also carried over to the large screens. This just doesn’t work at all.

**It’s not a replacement for the iPad**

After a week of use it’s pretty clear to me that the surface pro is not really a tablet. I own an iPad and I continued to prefer it as my tablet both hardware and software wise. The aspect ratio on the surface doesn’t lend itself very well to the tablet experience. The browser is worse than on the iPad, and IE is the only browser that works ok in a tablet fashion. I found that I actually preferred IE over chrome which is pretty depressing.

I also encountered a pretty serious issue where the surface would randomly refuse to wake from sleep sometimes and reboot instead. Basically every time I closed the lid, I risked losing all my unsaved work. Microsoft’s tech support walked me through all the scripts that covered anything related to this issue, including a full factory reset, but nothing helped. I do know that other users have reported the same problem and suspect it’s caused by some application that I installed. Sadly I only installed the bare basics for work such as visual studio, vmware, sublime text, and office, so it looks like another show stopping problem.

You can forget about using any desktop applications in portrait orientation or without the type cover, the experience is just painful. Also, every time you go to portrait mode, your desktop icons are rearranged to fit horizontally and don’t go back when you return to landscape.

Last but not least, it’s just too thick and heavy to comfortably hold as a tablet. This is excusable if you account for the fact that you’re actually holding a high end laptop worth of horsepower, but it doesn’t change the fact that it’s a poor tablet experience.

**Wrapping it up**

All in all, the surface pro is an incredible piece of hardware at a really good price point, and I really want to like it, but in the end it can’t replace my laptop and it can’t replace my iPad. I would love to own one in addition to a laptop+tablet, but I just can’t justify 1500 dollars on a device that doesn’t have a clear purpose. I might reconsider in the future when high dpi screens become prevalent and application developers are forced to support them.

Despite the issues I’ve had, I’m going to be sad to part with the surface. It looks and feels amazing and I imagine it’s a great device for lighter work. I was also surprised that despite my strong dislike for windows 8 based on previous experience, after a week I’m not only used to it but actually prefer it in many ways. I thought the first thing I’ll be doing is reverting the start menu back to 7, but you know what? The windows 8 start is actually really good if you give it a chance, and doubles as a solid replacement for Launchy.

**Follow-up**

After a few more days of taking it to work, I ended up returning my surface for a refund. My general experience was that the small screen and the type cover simply weren’t good enough for prolonged serious work. In particular, there wasn’t enough screen real-estate to have a decent Visual Studio workspace and I found myself constantly trying to find balance between making the text too small or not being able to see enough code at once.

This was compounded by the fact that the dual screen experience was terrible, and my original notion of coming to work/home and docking the surface for serious work was simply not viable due to the hidpi scaling issues mentioned above. This was the selling point of the surface for me, and it simply didn’t deliver.

**Instead, I picked up the Lenovo Yoga 2 Pro and couldn’t be happier**

Despite being larger, I think this laptop is on equal footing in terms of mobility; I feel that it’s actually better because you can comfortably plop it in your lap – which is rather difficult with the surface + type cover, and you can stand it up on any angle rather than the surface’s 2 predefined kickstand modes.

In exchange for the higher price tag and larger form factor, you get a real keyboard, a good trackpad, 2 USB ports, and a screen that’s large enough to comfortably use busy tools like VS for extended periods of time. The dual screen experience is just as bad as with the surface, but at least the Yoga 2 is a perfectly usable development machine on its own.

Since the time of the original post, hidpi software adoption has also made great progress, and you can now expect a lot of tools to work out of the box. Two notable exceptions are Adobe Photoshop which is completely unusable, and Remote Desktop which doesn’t support scaling of the remote display. An alternative to the latter is Remote Desktop Connection Manager which is free and supports scaling.

```
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import rcParams
from matplotlib.patches import Polygon
from matplotlib.ticker import MultipleLocator

def rstyle(ax):
    """Styles an axes to appear like ggplot2
    Must be called after all plot and axis manipulation operations have been carried out (needs to know final tick spacing)
    """
    #set the style of the major and minor grid lines, filled blocks
    ax.grid(True, 'major', color='w', linestyle='-', linewidth=1.4)
    ax.grid(True, 'minor', color='0.92', linestyle='-', linewidth=0.7)
    ax.patch.set_facecolor('0.85')
    ax.set_axisbelow(True)

    #set minor tick spacing to 1/2 of the major ticks
    ax.xaxis.set_minor_locator(MultipleLocator( (plt.xticks()[0][1]-plt.xticks()[0][0]) / 2.0 ))
    ax.yaxis.set_minor_locator(MultipleLocator( (plt.yticks()[0][1]-plt.yticks()[0][0]) / 2.0 ))

    #remove axis border
    for child in ax.get_children():
        if isinstance(child, matplotlib.spines.Spine):
            child.set_alpha(0)

    #restyle the tick lines
    for line in list(ax.get_xticklines()) + list(ax.get_yticklines()):
        line.set_markersize(5)
        line.set_color("gray")
        line.set_markeredgewidth(1.4)

    #remove the minor tick lines
    for line in list(ax.xaxis.get_ticklines(minor=True)) + list(ax.yaxis.get_ticklines(minor=True)):
        line.set_markersize(0)

    #only show bottom left ticks, pointing out of axis
    rcParams['xtick.direction'] = 'out'
    rcParams['ytick.direction'] = 'out'
    ax.xaxis.set_ticks_position('bottom')
    ax.yaxis.set_ticks_position('left')

    if ax.legend_ is not None:
        lg = ax.legend_
        lg.get_frame().set_linewidth(0)
        lg.get_frame().set_alpha(0.5)

def rhist(ax, data, **keywords):
    """Creates a histogram with default style parameters to look like ggplot2
    Is equivalent to calling ax.hist and accepts the same keyword parameters.
    If style parameters are explicitly defined, they will not be overwritten
    """
    defaults = {
        'facecolor' : '0.3',
        'edgecolor' : '0.28',
        'linewidth' : 1,
        'bins' : 100
    }
    for k, v in defaults.items():
        if k not in keywords: keywords[k] = v
    return ax.hist(data, **keywords)

def rbox(ax, data, **keywords):
    """Creates a ggplot2 style boxplot, is equivalent to calling ax.boxplot with the following additions:
    Keyword arguments:
    colors -- array-like collection of colours for box fills
    names -- array-like collection of box names which are passed on as tick labels
    """
    hasColors = 'colors' in keywords
    if hasColors:
        colors = keywords['colors']
        keywords.pop('colors')

    if 'names' in keywords:
        ax.tickNames = plt.setp(ax, xticklabels=keywords['names'] )
        keywords.pop('names')

    bp = ax.boxplot(data, **keywords)
    plt.setp(bp['boxes'], color='black')
    plt.setp(bp['whiskers'], color='black', linestyle = 'solid')
    plt.setp(bp['fliers'], color='black', alpha = 0.9, marker= 'o', markersize = 3)
    plt.setp(bp['medians'], color='black')

    numBoxes = len(data)
    for i in range(numBoxes):
        box = bp['boxes'][i]
        boxX = []
        boxY = []
        for j in range(5):
            boxX.append(box.get_xdata()[j])
            boxY.append(box.get_ydata()[j])
        boxCoords = list(zip(boxX,boxY))

        if hasColors:
            boxPolygon = Polygon(boxCoords, facecolor = colors[i % len(colors)])
        else:
            boxPolygon = Polygon(boxCoords, facecolor = '0.95')

        ax.add_patch(boxPolygon)
    return bp
```

Usage is very simple: call rstyle(axes) just before showing or saving your figure. It is key to call it after all drawing and axis manipulation has been done, because it reads the major tick positions to work out where to put the minors.

```
from pylab import *
import scipy.stats

t = arange(0.0, 100.0, 0.1)
s = sin(0.1*pi*t)*exp(-t*0.01)

fig = plt.figure()
ax = fig.add_subplot(111)
plot(t,s, label = "Original")
plot(t,s*2, label = "Doubled")
ax.legend()
rstyle(ax)
plt.show()
```


I have also included a function that creates a ggplot style histogram for you. This is nothing more than setting some default parameters to the hist function.

```
from pylab import *
import scipy.stats

t = arange(0.0, 100.0, 0.1)
s = sin(0.1*pi*t)*exp(-t*0.01)

fig = plt.figure()
ax = fig.add_subplot(111)
data = scipy.stats.norm.rvs(size = 1000)
rhist(ax, data, label = "Histogram")
ax.legend()
rstyle(ax)
plt.show()
```


There is also a slightly more involved boxplot function which handles fill colours and names for you.

from pylab import *

import scipy.stats

data = [scipy.stats.norm.rvs(size = 100), scipy.stats.norm.rvs(size = 100), scipy.stats.norm.rvs(size = 100)]

fig = plt.figure()

ax = fig.add_subplot(111)

ax.legend()

rbox(ax, data, names = ("One", "Two", "Three"), colors = ('white', 'cyan'))

rstyle(ax)

plt.show()
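Note that the example passes two colours for three boxes. The boxplot function indexes the colours with `i % len(colors)`, so they simply repeat. A small sketch of that cycling, using a hypothetical helper `fill_colours` that is not part of the original script:

```python
def fill_colours(num_boxes, colors=None, default='0.95'):
    """Pick one fill colour per box, cycling when there are fewer colours than boxes."""
    if not colors:
        # No colours supplied: fall back to the light grey default
        return [default] * num_boxes
    return [colors[i % len(colors)] for i in range(num_boxes)]


print(fill_colours(3, ('white', 'cyan')))  # ['white', 'cyan', 'white']
print(fill_colours(3))                     # ['0.95', '0.95', '0.95']
```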

Finally, with a bit of help from Justin Peel over at StackOverflow, you can get some really nice graphics going that you won’t be ashamed to put in your published material or presentation.

I have only used these scripts in my fairly limited scenario, and there are several obvious limitations, such as the requirement to pass an axes, the enforcement of minor ticks at 1/2 majors, and the fact that I haven’t really done much with the legend. Still, it should be enough to get you started in your projects. Happy visualizing!

Haven’t heard of Sublime Text 2? Well, I guess you’ve been living under a rock. Oh… you have? Well, in that case – why aren’t you using it yet?

Somewhere between a text editor and an IDE, Sublime Text 2 has been something of an eye-opener into what text editors can really be.

Really, it can make a little nerd’s heart race just seeing this program in action. More than being pretty, it’s been a huge increase in productivity for me, and every day I find something new: either a plugin I can download, or a feature released by the developers, who work at a rather intense pace.

Not convinced? Well, this is why *I* use it. Hopefully I can make you use it too. First of all, it’s extremely fast. It’s extremely flexible. It’s also sort of free… well, more like an unlimited trial with a ‘buy me now’ nag every 10 or so saves. It’s also cross-platform!

I recommend buying it for the paltry $59 though, and this is why:

*It’s fast:*

Loading the program itself is practically immediate, depending on your system. It’ll open up your last project in little more than a heartbeat. Clicking open files shows the source code immediately (even for large amounts of text), and allows you to quickly glance over large amounts of code.

Not only is it fast in the most literal sense… getting places is really quick too. Seriously, check these keyboard shortcuts out:

*CTRL + P:* Quickly select files from your open project using a very quick fuzzy match searching algorithm. You can type the filename or start writing the file path… it’ll find it and immediately preview the selected file as you scroll through your search.

*CTRL + Shift + P:* Run different commands from your installed addons, change themes, insert snippets… this is your go-to command for this sort of thing.

*CTRL + Shift + F:* Quick find. Search all open documents and list the files and an excerpt almost instantly. Double click the excerpt or the filename and you’ve found your chosen piece of code.

*CTRL + G:* Go to a line number.
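If “fuzzy match” is new to you, the rough idea is that the letters you type only need to appear in the file path in the same order, not next to each other. The following is a minimal sketch of that idea and not Sublime Text’s actual algorithm (which also ranks the results). The function name and file list are made up for illustration:

```python
def fuzzy_match(query, candidate):
    """True if the characters of query appear in candidate in order (case-insensitive)."""
    remaining = iter(candidate.lower())
    # `ch in remaining` advances the iterator, so each match must come after the last one
    return all(ch in remaining for ch in query.lower())


files = ["blog/models.py", "blog/views.py", "templates/base.html"]  # hypothetical project files
print([f for f in files if fuzzy_match("bvw", f)])  # ['blog/views.py']
```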

*It’s flexible:*

More to the point, you’ll want to install the Sublime Package Control plugin. With this, you can search and install plugins from a rather large list, often coming straight from the GitHub/Bitbucket repositories. The addon system alone is a reason you’ll want Sublime Text 2. From installing new themes, to adding missing or useful features, to just making the development process that little bit nicer: you want this.

As a Django/Python developer, I recommend a few plugins:

Djaneiro: Your one stop for Django syntax and code highlighting for Django templates. If you’re a Django developer, get this.

SublimeRope: Code completion – and a nifty ‘go to definition’ feature… though not without some initial setup. I’ll briefly tell you how to set rope up to actually work… because it doesn’t unless you do the following:

Press CTRL + Shift + P and type “rope”, and find “New Project”. At the bottom it will ask you for project details. Press enter for the first one, and you may have success with the next if you use a virtualenv (I always get errors). If you do too, continue on:

Go to your new .ropeproject folder in your project and edit the config.py. Find the section that looks like this:

# You can extend python path for looking up modules
#prefs.add('python_path', '~/python/')

Here you can add the sources to your site-packages and your current open project, e.g.:

prefs.add('python_path', '/path/to/site-packages')
prefs.add('python_path', '/path/to/your/project')
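If you’re not sure where your site-packages folder lives, you can ask Python itself. This is a small sketch using the standard library. It assumes you run it with the interpreter of the virtualenv you develop in, from your project’s root folder, and it prints lines you can paste into config.py:

```python
import os
import sysconfig

# Directory where this interpreter installs third-party packages
site_packages = sysconfig.get_paths()["purelib"]

# Assumption: the script is run from the project's root folder
project_root = os.getcwd()

print("prefs.add('python_path', %r)" % site_packages)
print("prefs.add('python_path', %r)" % project_root)
```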

SideBarEnhancements: Extra stuff for your side bar on right click. Don’t have a sidebar? Go to view->Side bar. Easy.

BufferScroll: Remember and restore all open files, bookmarks, folds, etc. when you open the editor.

These are at least the basics. There are so many more plugins you could use that will make your life easier.

At least go out and try it. You can find the editor at http://www.sublimetext.com/2
