Abu Mostafa’s Machine Learning MOOC – Now on EdX

September 24, 2016 | Anirudh | Technical | Abu Mostafa, Algorithms, Andrew Ng, CalTech, Data Science, edX, Machine Learning, MOOC, Statistical Learning

This was in the pipeline for quite some time now. I have been waiting for his lectures on a platform such as EdX or Coursera, and the day has arrived. You can enroll and start with week 1’s lectures as they’re live now.

This course is taught by none other than Dr. Yaser S. Abu-Mostafa, whose textbook on machine learning, Learning from Data, is the #1 bestselling textbook (Amazon) in all categories of Computer Science. His online course has been offered earlier over here.

Teaching

Dr. Abu-Mostafa received the Clauser Prize for the most original doctoral thesis at Caltech. He received the ASCIT Teaching Awards in 1986, 1989 and 1991, the GSC Teaching Awards in 1995 and 2002, and the Richard P. Feynman prize for excellence in teaching in 1996.

Live ‘One-take’ Recordings

The lectures have been recorded from a live broadcast (including Q&A, which will let you gauge the level of CalTech students taking this course). In fact, it almost seems as though Abu Mostafa takes a direct jab at Andrew Ng’s popular Coursera MOOC by stating the obvious on his course page.

A real Caltech course, not a watered-down version

Again, while enrolling note that this is what Abu Mostafa had to say about the online course: “A Caltech course does not cater to short attention spans, and it may not provide instant gratification…[like] many MOOCs out there that are quite simple and have a ‘video game’ feel to them.” Unsurprisingly, many online students have dropped out in the past, but some of those students who “complained early on but decided to stick with the course had very flattering words to say at the end”.

Prerequisites

Basic probability
Basic matrices
Basic calculus
Some programming language/platform (I choose Python!)

If you’re looking for a challenging machine learning course, this is probably one you must take.

Implementing Undirected Graphs in Python

July 28, 2016 | Anirudh | Technical | Adjacency List, Adjacency Matrix, Algorithms, Code Snippets, example, Graphs, Math, Python

There are 2 popular ways of representing an undirected graph.

Adjacency List
Each list describes the set of neighbors of a vertex in the graph.

Adjacency Matrix
The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.

Here’s an implementation of the above in Python:

	import numpy as np


	class Vertex:
	    def __init__(self, vertex):
	        self.name = vertex
	        self.neighbors = []

	    def add_neighbor(self, neighbor):
	        """ Connects this vertex and `neighbor` in both directions. """
	        if isinstance(neighbor, Vertex):
	            if neighbor.name not in self.neighbors:
	                self.neighbors.append(neighbor.name)
	                neighbor.neighbors.append(self.name)
	                # sort in place so that lists already stored in a Graph stay current
	                self.neighbors.sort()
	                neighbor.neighbors.sort()
	        else:
	            return False

	    def add_neighbors(self, neighbors):
	        for neighbor in neighbors:
	            # stop at the first object that is not a Vertex
	            if self.add_neighbor(neighbor) is False:
	                return False

	    def __repr__(self):
	        return str(self.neighbors)


	class Graph:
	    def __init__(self):
	        self.vertices = {}

	    def add_vertex(self, vertex):
	        if isinstance(vertex, Vertex):
	            self.vertices[vertex.name] = vertex.neighbors

	    def add_vertices(self, vertices):
	        for vertex in vertices:
	            self.add_vertex(vertex)

	    def add_edge(self, vertex_from, vertex_to):
	        if isinstance(vertex_from, Vertex) and isinstance(vertex_to, Vertex):
	            vertex_from.add_neighbor(vertex_to)
	            self.vertices[vertex_from.name] = vertex_from.neighbors
	            self.vertices[vertex_to.name] = vertex_to.neighbors

	    def add_edges(self, edges):
	        for edge in edges:
	            self.add_edge(edge[0], edge[1])

	    def adjacencyList(self):
	        if len(self.vertices) >= 1:
	            return [str(key) + ":" + str(self.vertices[key]) for key in self.vertices.keys()]
	        else:
	            return dict()

	    def adjacencyMatrix(self):
	        if len(self.vertices) >= 1:
	            # use self (not the global g) so the method works for any Graph instance
	            self.vertex_names = sorted(self.vertices.keys())
	            self.vertex_indices = dict(zip(self.vertex_names, range(len(self.vertex_names))))
	            self.adjacency_matrix = np.zeros(shape=(len(self.vertices), len(self.vertices)))
	            for i in range(len(self.vertex_names)):
	                for el in self.vertices[self.vertex_names[i]]:
	                    j = self.vertex_indices[el]
	                    self.adjacency_matrix[i, j] = 1
	            return self.adjacency_matrix
	        else:
	            return dict()


	def graph(g):
	    """ Function to print a graph as adjacency list and adjacency matrix. """
	    return str(g.adjacencyList()) + '\n' + '\n' + str(g.adjacencyMatrix())

	###################################################################################

	a = Vertex('A')
	b = Vertex('B')
	c = Vertex('C')
	d = Vertex('D')
	e = Vertex('E')

	a.add_neighbors([b, c, e])
	b.add_neighbors([a, c])
	c.add_neighbors([b, d, a, e])
	d.add_neighbor(c)
	e.add_neighbors([a, c])

	g = Graph()
	print(graph(g))
	print()
	g.add_vertices([a, b, c, d, e])
	g.add_edge(b, d)
	print(graph(g))


Output:

	{}

	{}


	["A:['B', 'C', 'E']", "C:['A', 'B', 'D', 'E']", "B:['A', 'C', 'D']", "E:['A', 'C']", "D:['B', 'C']"]

	[[ 0. 1. 1. 0. 1.]
	[ 1. 0. 1. 1. 0.]
	[ 1. 1. 0. 1. 1.]
	[ 0. 1. 1. 0. 0.]
	[ 1. 0. 1. 0. 0.]]
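For comparison, here is a more compact sketch of the same two representations built straight from an edge list, using only the standard library. The function name `build_adjacency` and the dict-of-sets layout are my own choices for illustration, not part of the gist above; the edge list is the one the example graph ends up with.

```python
def build_adjacency(edges):
    """Return (adjacency list, adjacency matrix) for an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        # record each undirected edge in both directions
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    names = sorted(adjacency)
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for u, neighbors in adjacency.items():
        for v in neighbors:
            matrix[index[u]][index[v]] = 1
    adjacency_list = {name: sorted(adjacency[name]) for name in names}
    return adjacency_list, matrix


edges = [("A", "B"), ("A", "C"), ("A", "E"), ("B", "C"),
         ("B", "D"), ("C", "D"), ("C", "E")]
adjacency_list, matrix = build_adjacency(edges)
for name, neighbors in adjacency_list.items():
    print(name, neighbors)
for row in matrix:
    print(row)
```

The matrix is symmetric because every edge is written in both directions, which is what makes the graph undirected.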


Deterministic Selection Algorithm Python Code

July 22, 2016 | October 14, 2017 | Anirudh | Technical | Algorithms, Code Snippets, Coursera, Deterministic Selection, Median of Medians, MOOC, Python, Tim Roughgarden

Through this post, I’m sharing Python code implementing the median of medians algorithm, an algorithm that resembles quickselect, differing only in the way in which the pivot is chosen, i.e., deterministically, instead of at random.

The textbook median of medians algorithm runs in O(n) time even in the worst case, whereas quickselect is O(n) on average but O(n²) in the worst case.

I don’t have a formal education in CS, and came across this algorithm while going through Tim Roughgarden’s Coursera MOOC on the design and analysis of algorithms. Check out my implementation in Python.

	def merge_tuple(a, b):
	    """ Function to merge two sorted arrays of (value, index) tuples """
	    c = []
	    while len(a) != 0 and len(b) != 0:
	        if a[0][0] < b[0][0]:
	            c.append(a.pop(0))
	        else:
	            c.append(b.pop(0))
	    if len(a) == 0:
	        c += b
	    else:
	        c += a
	    return c

	def mergesort_tuple(x):
	    """ Function to sort an array of tuples using merge sort algorithm """
	    if len(x) == 0 or len(x) == 1:
	        return x
	    else:
	        middle = len(x) // 2
	        a = mergesort_tuple(x[:middle])
	        b = mergesort_tuple(x[middle:])
	        return merge_tuple(a, b)

	def middle_index(x):
	    """ Returns the index of the middle element (the lower middle for even lengths) """
	    return (len(x) - 1) // 2

	def lol(x, k):
	    """ Function to divide a list into a list of lists of size k each. """
	    return [x[i:i+k] for i in range(0, len(x), k)]

	def preprocess(x):
	    """ Function to assign an index to each element of a list of integers, outputting a list of tuples"""
	    return list(zip(x, range(len(x))))

	def partition(x, pivot_index = 0):
	    """ Function to partition an unsorted array around a pivot"""
	    i = 0
	    if pivot_index != 0: x[0], x[pivot_index] = x[pivot_index], x[0]
	    for j in range(len(x)-1):
	        if x[j+1] < x[0]:
	            x[j+1], x[i+1] = x[i+1], x[j+1]
	            i += 1
	    x[0], x[i] = x[i], x[0]
	    return x, i

	def ChoosePivot(x):
	    """ Function to choose pivot element of an unsorted array using 'Median of Medians' method. """
	    if len(x) <= 5:
	        return mergesort_tuple(x)[middle_index(x)]
	    else:
	        lst = lol(x, 5)
	        lst = [mergesort_tuple(el) for el in lst]
	        C = [el[middle_index(el)] for el in lst]
	        return ChoosePivot(C)

	def DSelect(x, k):
	    """ Function to select the k-th smallest element (k counted from 0) of an unsorted array """
	    if len(x) == 1:
	        return x[0]
	    else:
	        # ChoosePivot returns a (value, index) tuple; [1] is the pivot's index in x
	        xpart = partition(x, ChoosePivot(preprocess(x))[1])
	        x = xpart[0] # partitioned array
	        j = xpart[1] # pivot index
	        if j == k:
	            return x[j]
	        elif j > k:
	            return DSelect(x[:j], k)
	        else:
	            k = k - j - 1
	            return DSelect(x[(j+1):], k)

	arr = list(range(100, 0, -1))
	print(DSelect(arr, 50))
	# %timeit DSelect(arr, 50)   (IPython magic; run this line in IPython / Jupyter)


I get the following output:

51
100 loops, best of 3: 2.38 ms per loop

Note that on the same input, quickselect is faster, giving us:

1000 loops, best of 3: 254 µs per loop
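As a cross-check, here is a compact, self-contained sketch of the same idea that can be compared against `sorted()`. It is a simplified variant rather than the code above: it builds new lists instead of partitioning in place, and the name `median_of_medians_select` is only illustrative.

```python
import random


def median_of_medians_select(items, k):
    """Return the k-th smallest element of items (k counted from 0)."""
    items = list(items)
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # medians of groups of five, then (recursively) the median of those medians
        groups = [items[i:i + 5] for i in range(0, len(items), 5)]
        medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
        pivot = median_of_medians_select(medians, (len(medians) - 1) // 2)
        smaller = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        if k < len(smaller):
            items = smaller
        elif k < len(smaller) + len(equal):
            return pivot
        else:
            k -= len(smaller) + len(equal)
            items = [v for v in items if v > pivot]


print(median_of_medians_select(range(100, 0, -1), 50))  # 51, as in the post

random.seed(1)
data = [random.randrange(1000) for _ in range(500)]
assert all(median_of_medians_select(data, k) == sorted(data)[k] for k in range(0, 500, 25))
```

Handling elements equal to the pivot separately keeps the sketch correct when the input contains duplicates.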

scikit-learn Linear Regression Example

July 19, 2016 | Anirudh | Technical | Algorithms, Code Snippets, Coding, example, IPython, Jupyter, Linear Regression, Machine Learning, Python, scikit learn, sklearn

Here’s a quick example case for implementing one of the simplest of learning algorithms in any machine learning toolbox – Linear Regression. You can download the IPython / Jupyter notebook here so as to play around with the code and try things out yourself.

I’m doing a series of posts on scikit-learn. Its documentation is vast, so unless you’re willing to search for a needle in a haystack, you’re better off NOT jumping into the documentation right away. Instead, knowing chunks of code that do the job might help.
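As one such chunk, here is a minimal sketch of fitting a linear regression with scikit-learn. It is not the contents of the notebook; the toy data (points lying exactly on y = 2x + 1) is made up purely for illustration, and it assumes numpy and scikit-learn are installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data, made up for illustration: points on the line y = 2x + 1
X = np.arange(10, dtype=float).reshape(-1, 1)  # shape (n_samples, n_features)
y = 2.0 * X.ravel() + 1.0

model = LinearRegression()
model.fit(X, y)

print(model.coef_, model.intercept_)          # slope(s) and intercept learned from the data
prediction = model.predict(np.array([[12.0]]))
print(prediction)
```

Note that `fit` and `predict` expect a 2-D array of features, which is why the single feature column is reshaped with `reshape(-1, 1)`.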


Randomized Selection Algorithm (Quickselect) – Python Code

July 18, 2016 | October 14, 2017 | Anirudh | Technical | Algorithms, Python, Quick Sort, Randomized Selection

Find the k-th smallest element in an array without sorting.

That’s basically what this algorithm does. It piggybacks on the partition subroutine from the Quick Sort. If you don’t know what that is, you can check out more about the Quick Sort algorithm here and here, and understand the usefulness of partitioning an unsorted array around a pivot.

[Animation: the randomized selection algorithm selecting the 22nd smallest value]

Python Implementation

	from random import randrange

	def partition(x, pivot_index = 0):
	    """ Partitions x in place around x[pivot_index]; returns the array and the pivot's final index """
	    i = 0
	    if pivot_index != 0: x[0], x[pivot_index] = x[pivot_index], x[0]
	    for j in range(len(x)-1):
	        if x[j+1] < x[0]:
	            x[j+1], x[i+1] = x[i+1], x[j+1]
	            i += 1
	    x[0], x[i] = x[i], x[0]
	    return x, i

	def RSelect(x, k):
	    """ Returns the k-th smallest element of x (k counted from 0) using a random pivot """
	    if len(x) == 1:
	        return x[0]
	    else:
	        xpart = partition(x, randrange(len(x)))
	        x = xpart[0] # partitioned array
	        j = xpart[1] # pivot index
	        if j == k:
	            return x[j]
	        elif j > k:
	            return RSelect(x[:j], k)
	        else:
	            k = k - j - 1
	            return RSelect(x[(j+1):], k)

	x = [3, 1, 8, 4, 7, 9]
	for i in range(len(x)):
	    print(RSelect(x, i), end=' ')


Computing Work Done (Total Pivot Comparisons) by Quick Sort

July 13, 2016 | July 14, 2016 | Anirudh | Technical | Algorithms, Math, Pivot Comparisons, Python, Quick Sort, Tim Roughgarden

A key aspect of the Quick Sort algorithm is how the pivot element is chosen. In my earlier post on the Python code for Quick Sort, my implementation takes the first element of the unsorted array as the pivot element.

However, some mathematical analysis shows that such an implementation takes O(n²) time in the worst case (for example, on an already sorted array), whereas with a randomly chosen pivot the Quick Sort algorithm runs in O(n log n) time on average.

To witness this in action, one can measure the work done by the algorithm comparing two cases, one with a randomized pivot choice – and one with a fixed pivot choice, say the first element of the array (or the last element of the array).
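A small experiment along these lines, as a sketch: it sorts an already sorted array of 200 numbers (an input chosen here as the bad case for a first-element pivot) and counts comparisons by adding m−1 for every call on a subarray of length m. The helper `count_comparisons` and its `choose_pivot` argument are illustrative names, not taken from the code later in this post.

```python
import random


def count_comparisons(values, choose_pivot):
    """Quick Sort a copy of values; return the total number of pivot comparisons."""
    x = list(values)
    if len(x) <= 1:
        return 0
    p = choose_pivot(x)              # index of the pivot element
    x[0], x[p] = x[p], x[0]
    i = 0
    for j in range(1, len(x)):
        if x[j] < x[0]:
            i += 1
            x[j], x[i] = x[i], x[j]
    x[0], x[i] = x[i], x[0]
    # a call on a subarray of length m contributes m - 1 comparisons
    return (len(x) - 1
            + count_comparisons(x[:i], choose_pivot)
            + count_comparisons(x[i + 1:], choose_pivot))


random.seed(0)
already_sorted = list(range(200))
first = count_comparisons(already_sorted, lambda x: 0)
rand = count_comparisons(already_sorted, lambda x: random.randrange(len(x)))
print(first, rand)
```

With the first element as pivot, every call peels off just one element, so the count is 199 + 198 + … + 1 = 19,900; the randomized run needs far fewer comparisons.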

Implementation

A decent proxy for the amount of work done by the algorithm would be the number of pivot comparisons. These comparisons needn’t be counted one by one; instead, whenever there is a recursive call on a subarray of length m, simply add m−1 to the running total of comparisons.

3 Cases

To put things in perspective, let’s look at 3 cases. (This is basically straight out of a homework assignment from Tim Roughgarden’s course on the Design and Analysis of Algorithms).
Case I with the pivot being the first element.
Case II with the pivot being the last element.
Case III using the “median-of-three” pivot rule. The primary motivation behind this rule is to do a little bit of extra work to get much better performance on input arrays that are nearly sorted or reverse sorted.

Median-of-Three Pivot Rule

Consider the first, middle, and final elements of the given array. (If the array has odd length it should be clear what the “middle” element is; for an array with even length 2k, use the k-th element as the “middle” element. So for the array 4 5 6 7, the “middle” element is the second one, 5, and not 6!) Identify which of these three elements is the median (i.e., the one whose value is in between the other two), and use this as your pivot.
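The rule can be sketched in a few lines. This is a standalone illustration (the function name `median_of_three_index` is made up here and is not the helper used in the full code further down); it returns the index of the chosen pivot.

```python
def median_of_three_index(x):
    """Index of the median among the first, middle and last elements of x."""
    # for even length 2k this picks the k-th element (counting from 1)
    middle = (len(x) - 1) // 2
    candidates = [0, middle, len(x) - 1]
    candidates.sort(key=lambda idx: x[idx])   # order the three indices by value
    return candidates[1]


print(median_of_three_index([4, 5, 6, 7]))        # candidates 4, 5, 7 -> index 1 (value 5)
print(median_of_three_index([8, 2, 4, 5, 7, 1]))  # candidates 8, 4, 1 -> index 2 (value 4)
```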

Python Code

This file contains all of the integers between 1 and 10,000 (inclusive, with no repeats) in unsorted order. The integer in the i-th row of the file gives you the i-th entry of an input array. I downloaded this file and named it QuickSort_List.txt

You can run the code below and see for yourself that the number of comparisons for Case III is 138,382 compared to 162,085 and 164,123 for Case I and Case II respectively. You can play around with the code in an IPython / Jupyter notebook here.

	#!/usr/bin/env python

	# Case I
	# First element of the unsorted array is chosen as pivot element for sorting using Quick Sort

	def countComparisonsWithFirst(x):
	    """ Counts number of comparisons while using Quick Sort with first element of unsorted array as pivot """
	    global count_pivot_first
	    if len(x) == 1 or len(x) == 0:
	        return x
	    else:
	        count_pivot_first += len(x)-1
	        i = 0
	        for j in range(len(x)-1):
	            if x[j+1] < x[0]:
	                x[j+1], x[i+1] = x[i+1], x[j+1]
	                i += 1
	        x[0], x[i] = x[i], x[0]
	        first_part = countComparisonsWithFirst(x[:i])
	        second_part = countComparisonsWithFirst(x[i+1:])
	        first_part.append(x[i])
	        return first_part + second_part

	# Case II
	# Last element of the unsorted array is chosen as pivot element for sorting using Quick Sort

	def countComparisonsWithLast(x):
	    """ Counts number of comparisons while using Quick Sort with last element of unsorted array as pivot """
	    global count_pivot_last
	    if len(x) == 1 or len(x) == 0:
	        return x
	    else:
	        count_pivot_last += len(x)-1
	        x[0], x[-1] = x[-1], x[0]
	        i = 0
	        for j in range(len(x)-1):
	            if x[j+1] < x[0]:
	                x[j+1], x[i+1] = x[i+1], x[j+1]
	                i += 1
	        x[0], x[i] = x[i], x[0]
	        first_part = countComparisonsWithLast(x[:i])
	        second_part = countComparisonsWithLast(x[i+1:])
	        first_part.append(x[i])
	        return first_part + second_part

	# Case III
	# Median-of-three method used to choose pivot element for sorting using Quick Sort

	def middle_index(x):
	    """ Returns the index of the middle element of an array """
	    if len(x) % 2 == 0:
	        middle_index = len(x)//2 - 1
	    else:
	        middle_index = len(x)//2
	    return middle_index

	def median_index(x, i, j, k):
	    """ Returns the median index of three when passed an array and indices of any 3 elements of that array """
	    if (x[i]-x[j])*(x[i]-x[k]) < 0:
	        return i
	    elif (x[j]-x[i])*(x[j]-x[k]) < 0:
	        return j
	    else:
	        return k

	def countComparisonsMedianOfThree(x):
	    """ Counts number of comparisons while using Quick Sort with the median-of-three element as pivot """
	    global count_pivot_median
	    if len(x) == 1 or len(x) == 0:
	        return x
	    else:
	        count_pivot_median += len(x)-1
	        k = median_index(x, 0, middle_index(x), -1)
	        if k != 0: x[0], x[k] = x[k], x[0]
	        i = 0
	        for j in range(len(x)-1):
	            if x[j+1] < x[0]:
	                x[j+1], x[i+1] = x[i+1], x[j+1]
	                i += 1
	        x[0], x[i] = x[i], x[0]
	        first_part = countComparisonsMedianOfThree(x[:i])
	        second_part = countComparisonsMedianOfThree(x[i+1:])
	        first_part.append(x[i])
	        return first_part + second_part

	#####################################################################
	# initializing counts
	count_pivot_first = 0; count_pivot_last = 0; count_pivot_median = 0

	#####################################################################
	NUMLIST_FILENAME = "QuickSort_List.txt"

	def readNumList(filename):
	    """ Reads the contents of the file (one integer per line) into a fresh Python list """
	    with open(filename, 'r') as f:
	        return [int(line.strip()) for line in f]

	# each function partitions its input in place, so the file is re-read for every case
	countComparisonsWithFirst(readNumList(NUMLIST_FILENAME))      # Case I
	countComparisonsWithLast(readNumList(NUMLIST_FILENAME))       # Case II
	countComparisonsMedianOfThree(readNumList(NUMLIST_FILENAME))  # Case III
	#####################################################################

	print(count_pivot_first, count_pivot_last, count_pivot_median)


Quick Sort Python Code

July 8, 2016 | July 14, 2016 | Anirudh | Technical | Algorithms, Python, Quick Sort, Sorting

Yet another post for the crawlers to better index my site for algorithms and as a repository for Python code. The quick sort algorithm is well explained in the topmost Google search result for ‘Quick Sort Python Code’, but the code is unnecessarily convoluted. Instead, go with the code below.

In it, I assume the pivot to be the first element. You can easily add a function to randomize selection of the pivot. Choosing a random pivot minimizes the chance that you will encounter worst-case O(n²) performance. Always choosing first or last would cause worst-case performance for nearly-sorted or nearly-reverse-sorted data.

	def quicksort(x):
	    """ Returns a sorted list; the input list is only partially reordered along the way """
	    if len(x) == 1 or len(x) == 0:
	        return x
	    else:
	        pivot = x[0]
	        i = 0
	        for j in range(len(x)-1):
	            if x[j+1] < pivot:
	                x[j+1], x[i+1] = x[i+1], x[j+1]
	                i += 1
	        x[0], x[i] = x[i], x[0]
	        first_part = quicksort(x[:i])
	        second_part = quicksort(x[i+1:])
	        first_part.append(x[i])
	        return first_part + second_part

	alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
	# print the returned list; alist itself is not fully sorted in place
	print(quicksort(alist))
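Randomizing the pivot takes only a couple of extra lines. The following is a sketch under my own naming (`quicksort_random`), not a drop-in replacement from the gist: it swaps a randomly chosen element to the front and then partitions exactly as before, working on a copy so the caller's list is left alone.

```python
from random import randrange


def quicksort_random(x):
    """Return a sorted copy of x, choosing each pivot uniformly at random."""
    if len(x) <= 1:
        return list(x)
    x = list(x)                       # work on a copy so the caller's list is untouched
    p = randrange(len(x))             # random pivot index
    x[0], x[p] = x[p], x[0]           # move the pivot to the front
    pivot = x[0]
    i = 0
    for j in range(1, len(x)):
        if x[j] < pivot:
            i += 1
            x[j], x[i] = x[i], x[j]
    x[0], x[i] = x[i], x[0]           # put the pivot in its final position
    return quicksort_random(x[:i]) + [x[i]] + quicksort_random(x[i + 1:])


print(quicksort_random([54, 26, 93, 17, 77, 31, 44, 55, 20]))
# [17, 20, 26, 31, 44, 54, 55, 77, 93]
```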


Also read:
Computing Work Done (Total Pivot Comparisons) by Quick Sort
Karatsuba Multiplication Algorithm – Python Code
Merge Sort

Karatsuba Multiplication Algorithm – Python Code

October 13, 2015 | October 14, 2015 | Anirudh | Technical | Algorithms, Code Snippets, Coursera, Karatsuba, Math, Python

Motivation for this blog post

I’ve enrolled in Stanford Professor Tim Roughgarden’s Coursera MOOC on the design and analysis of algorithms, and while he covers the theory and intuition behind the algorithms in a surprising amount of detail, we’re left to implement them in a programming language of our choice.

~~And I’m going to post Python code for all the algorithms covered during the course!~~

The Karatsuba Multiplication Algorithm

Karatsuba’s algorithm reduces the multiplication of two n-digit numbers to at most $n^{\log_2 3}\approx n^{1.585}$ single-digit multiplications in general (and exactly $n^{\log_2 3}$ when n is a power of 2). The familiar grade school algorithm is how we work through multiplication in our day-to-day lives, but it needs $\Theta(n^2)$ single-digit multiplications, which makes it slower in comparison, though only on a computer, of course!

Here’s how the grade school algorithm looks:
(The following slides have been taken from Tim Roughgarden’s notes. They serve as a good illustration. I hope he doesn’t mind my sharing them.)

…and this is how Karatsuba Multiplication works on the same problem:

A More General Treatment

Let $x$ and $y$ be represented as $n$-digit strings in some base $B$. For any positive integer $m$ less than $n$, one can write the two given numbers as

$x = x_1B^m + x_0$
$y = y_1B^m + y_0$,

where $x_0$ and $y_0$ are less than $B^m$. The product is then

$xy = (x_1B^m + x_0)(y_1B^m + y_0)$
$xy = z_2B^{2m} + z_1B^m + z_0$

where

$z_2 = x_1y_1$
$z_1 = x_1y_0 + x_0y_1$
$z_0 = x_0y_0$

These formulae require four multiplications, and were known to Charles Babbage. Karatsuba observed that $xy$ can be computed in only three multiplications, at the cost of a few extra additions. With $z_0$ and $z_2$ as before we can calculate

$z_1 = (x_1 + x_0)(y_1 + y_0) - z_2 - z_0$

which holds since

$z_1 = x_1y_0 + x_0y_1$
$z_1 = (x_1 + x_0)(y_1 + y_0) - x_1y_1 - x_0y_0$

A more efficient implementation of Karatsuba multiplication can be set as $xy = (b^2 + b)x_1y_1 - b(x_1 - x_0)(y_1 - y_0) + (b + 1)x_0y_0$, where $b = B^m$.

Example

To compute the product of 12345 and 6789, choose B = 10 and m = 3. Then we decompose the input operands using the resulting base (B^m = 1000), as:

12345 = 12 · 1000 + 345

6789 = 6 · 1000 + 789

Only three multiplications, which operate on smaller integers, are used to compute three partial results:

z₂ = 12 × 6 = 72

z₀ = 345 × 789 = 272205

z₁ = (12 + 345) × (6 + 789) − z₂ − z₀ = 357 × 795 − 72 − 272205 = 283815 − 72 − 272205 = 11538

We get the result by just adding these three partial results, shifted accordingly (and then taking carries into account by decomposing these three inputs in base 1000 like for the input operands):

result = z₂ · B^(2m) + z₁ · B^m + z₀, i.e.

result = 72 · 1000² + 11538 · 1000 + 272205 = 83810205.
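The arithmetic in this example is easy to replay in Python. This is just a single, non-recursive step written out as a sketch; the variable names mirror the symbols used above, and the last check also evaluates the alternative form from the general treatment.

```python
x, y = 12345, 6789
B, m = 10, 3
b = B ** m                      # 1000

x1, x0 = divmod(x, b)           # 12, 345
y1, y0 = divmod(y, b)           # 6, 789

# only three multiplications of smaller numbers
z2 = x1 * y1
z0 = x0 * y0
z1 = (x1 + x0) * (y1 + y0) - z2 - z0

result = z2 * b**2 + z1 * b + z0
print(z2, z1, z0, result)       # 72 11538 272205 83810205

# the alternative form gives the same product
alt = (b**2 + b) * x1 * y1 - b * (x1 - x0) * (y1 - y0) + (b + 1) * x0 * y0
assert result == alt == x * y
```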

Pseudocode and Python code

	procedure karatsuba(num1, num2)
	    if (num1 < 10) or (num2 < 10)
	        return num1*num2
	    /* calculates the size of the numbers */
	    m = max(size_base10(num1), size_base10(num2))
	    m2 = floor(m/2)
	    /* split the digit sequences about the middle */
	    high1, low1 = split_at(num1, m2)
	    high2, low2 = split_at(num2, m2)
	    /* 3 calls made to numbers approximately half the size */
	    z0 = karatsuba(low1,low2)
	    z1 = karatsuba((low1+high1),(low2+high2))
	    z2 = karatsuba(high1,high2)
	    return (z2*10^(2*m2))+((z1-z2-z0)*10^(m2))+(z0)


	def karatsuba(x, y):
	    """Function to multiply 2 non-negative integers in a more efficient manner than the grade school algorithm"""
	    if len(str(x)) == 1 or len(str(y)) == 1:
	        return x*y
	    else:
	        n = max(len(str(x)), len(str(y)))
	        nby2 = n // 2

	        # split both numbers into high and low parts around 10**nby2
	        a = x // 10**(nby2)
	        b = x % 10**(nby2)
	        c = y // 10**(nby2)
	        d = y % 10**(nby2)

	        ac = karatsuba(a, c)
	        bd = karatsuba(b, d)
	        ad_plus_bc = karatsuba(a+b, c+d) - ac - bd

	        # this little trick, writing n as 2*nby2 takes care of both even and odd n
	        prod = ac * 10**(2*nby2) + (ad_plus_bc * 10**nby2) + bd

	        return prod


Teach Yourself Machine Learning the Hard Way!

October 9, 2015 | October 12, 2015 | Anirudh | Non Technical | Algorithms, Data Science, Machine Learning, Python

This formula is kick-ass!

Darshan Hegde

It has been 3 years since I have steered my interests towards Machine Learning. I had just graduated from college with a Bachelor of Engineering in Electronics and Communication Engineering. Which is, other way of saying that I was:

a toddler in programming.
little / no knowledge of algorithms.
studied engineering math, but it was rusty.
no knowledge of modern optimization.
zero knowledge of statistical inference.

I think, most of it is true for many engineering graduates (especially, in India !). Unless, you studied mathematics and computing for undergrad.

Lucky for me, I had a great mentor and lot of online materials on these topics. This post will list many such materials I found useful, while I was learning it the hard way !

All the courses that I’m listing below have homework assignments. Make sure you work through each one of them.

1. Learn Python

If you are new to programming…

View original post 507 more words

Magic 5-gon Ring — Project Euler (Problem 68)

September 13, 2015 | Anirudh | Non Technical | Algorithms, Code Snippets, Coding, Math, Problem, Project Euler, Puzzles, Python

Yet another exciting math problem that requires an algorithmic approach to arrive at a quick solution! There is a pen-paper approach to it too, but this post assumes we’re more interested in discussing the programming angle.

First, the problem:

Working clockwise, and starting from the group of three with the numerically lowest external node (4,3,2 in this example), each solution can be described uniquely. For example, the above solution can be described by the set: 4,3,2; 6,2,1; 5,1,3.

It is possible to complete the ring with four different totals: 9, 10, 11, and 12. There are eight solutions in total.

Total Solution Set:
9 4,2,3; 5,3,1; 6,1,2
9 4,3,2; 6,2,1; 5,1,3
10 2,3,5; 4,5,1; 6,1,3
10 2,5,3; 6,3,1; 4,1,5
11 1,4,6; 3,6,2; 5,2,4
11 1,6,4; 5,4,2; 3,2,6
12 1,5,6; 2,6,4; 3,4,5
12 1,6,5; 3,5,4; 2,4,6
By concatenating each group it is possible to form 9-digit strings; the maximum string for a 3-gon ring is 432621513.
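The 3-gon case is small enough to brute-force, which makes for a quick check of the solution set listed above. A sketch, with variable names of my own choosing (`o1..o3` for the outer nodes, `i1..i3` for the inner ones, taken clockwise):

```python
from itertools import permutations

solutions = []
for o1, o2, o3, i1, i2, i3 in permutations(range(1, 7)):
    # start from the group with the lowest outer node so each ring is counted once
    if o1 > o2 or o1 > o3:
        continue
    lines = [(o1, i1, i2), (o2, i2, i3), (o3, i3, i1)]
    if len({sum(line) for line in lines}) == 1:   # all three lines have the same total
        digits = "".join(str(n) for line in lines for n in line)
        solutions.append((sum(lines[0]), digits))

for total, digits in sorted(solutions):
    print(total, digits)
print(len(solutions), max(digits for _, digits in solutions))  # 8 432621513
```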

Problem

Using the numbers 1 to 10, and depending on arrangements, it is possible to form 16- and 17-digit strings. What is the maximum 16-digit string for a “magic” 5-gon ring?

Algorithm

In attempting this problem, I choose to label the 5 inner nodes as i, j, k, l, and m.
α, β, γ, δ, and θ being the corresponding outer nodes.

Let x be the sum total of each triplet line, i.e.,

x = α + i + j = β + j + k = γ + k + l = δ + l + m = θ + m + i

First Observation:
For the string to be 16-digits, 10 has to be in the outer ring, as each number in the inner ring is included in the string twice. Next, we fill the inner ring in an iterative manner.

Second Observation:
There are 9 numbers to choose from for the inner ring — 1, 2, 3, 4, 5, 6, 7, 8 and 9.
5 have to be chosen. This can be done in 9C5 = 126 ways.
According to circular permutation, if there are n distinct numbers to be arranged in a circle, this can be done in (n-1)! ways, where (n-1)! = (n-1).(n-2).(n-3)…3.2.1. So 5 distinct numbers can be arranged in 4! permutations, i.e., in 24 ways around a circle, or pentagonal ring, to be more precise.
So in all, this problem can be solved in 126×24 = 3024 iterations.
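The count is quick to confirm, both from the formula and by enumerating the arrangements directly. A short sketch (it assumes Python 3.8+ for `math.comb`):

```python
from itertools import combinations, permutations
from math import comb, factorial

print(comb(9, 5), factorial(4), comb(9, 5) * factorial(4))  # 126 24 3024

# Same count by enumeration: fix the first chosen number in place
# (rotations of a ring are equivalent) and permute the other four.
arrangements = sum(
    1
    for chosen in combinations(range(1, 10), 5)
    for _ in permutations(chosen[1:])
)
print(arrangements)  # 3024
```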

Third Observation:
For every possible permutation of an inner-ring arrangement, there can be one or more values of x (triplet line-sum) that serve as possible contenders for a “magic” string whose triplets add up to the same number, x. To ensure this, we only need that the values of α through θ of the outer ring are distinct, different from the inner ring, with the greatest of these equal to 10.
Depending on the relative positioning of the numbers in the inner ring, one can narrow the range of x-values one might have to check for each permutation. To zero in on such a range, let’s look at an example. Shown in the figure below is a randomly chosen permutation of numbers in the inner ring – 7, 2, 3, 4 and 5, in that order.

So 10, 9, 8, 6 and 1 must fill the outer circle. It’s easy to notice that the 5, 7 pair is the greatest adjacent pair. So whatever x is, it has to be at least 5 + 7 + 1 = 13 (1 being the smallest number of the outer ring). Likewise, 2, 3 is the smallest adjacent pair, so whatever x is, it can’t be any more than 2 + 3 + 10 = 15 (10 being the largest number of the outer ring). This leaves us with a narrow range of x-values to check – 13, 14 and 15.
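The bounds from this example can be computed mechanically. A small sketch, where `x_range` is an illustrative helper name and the inner ring is passed in circular order:

```python
def x_range(inner):
    """Candidate line sums x for an inner ring given in circular order."""
    outer = [n for n in range(1, 11) if n not in inner]
    # sums of neighbouring inner nodes, including the wrap-around pair
    pair_sums = [inner[i] + inner[(i + 1) % len(inner)] for i in range(len(inner))]
    x_min = max(pair_sums) + min(outer)   # the greatest pair still needs an outer node
    x_max = min(pair_sums) + max(outer)   # the smallest pair can get at most the largest outer node
    return list(range(x_min, x_max + 1))


print(x_range([7, 2, 3, 4, 5]))  # [13, 14, 15]
```

When the lower bound exceeds the upper bound the list is empty, meaning that inner arrangement cannot produce a magic ring at all.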

Next, we arrange the 5 triplets in clock-wise direction starting with the triplet with the smallest number in the outer ring to form a candidate string. This exercise when done for each of the 3024 permutations will shortlist a range of candidates, of which, the maximum is chosen.

That’s all there is to the problem!

Here’s the Python Code. It executes in about a tenth of a second!

	from itertools import permutations
	from itertools import combinations
	# array of candidate solutions empty at the beginning
	record = []

	# choose 5 numbers for inner cells between 1 and 9; there are 9C5 combinations
	# the problem asks for a 16-digit number, so 10 is not to be included in inner cells
	cells = range(1, 10)
	inner_cells = [list(comb) for comb in combinations(cells, 5)]

	# code to calculate min and max couple in an array (treated as a ring)
	def minCouple(array):
	    answer = array[0] + array[-1]
	    for i in range(len(array)-1):
	        coupleSum = array[i] + array[i+1]
	        if coupleSum < answer:
	            answer = coupleSum
	    return answer

	def maxCouple(array):
	    # start with the wrap-around pair, then check every adjacent pair
	    answer = array[0] + array[-1]
	    for i in range(len(array)-1):
	        coupleSum = array[i] + array[i+1]
	        if coupleSum > answer:
	            answer = coupleSum
	    return answer

	# Algorithm
	for array in inner_cells:
	    pivot = array[0]
	    perm_array = array[1:]
	    perms = [list(perm) for perm in permutations(perm_array, 4)]
	    for perm in perms:
	        checkArray = perm
	        checkArray.insert(0, pivot)
	        outerRing = [el for el in range(1, 11) if el not in checkArray]
	        xMax = minCouple(checkArray) + max(outerRing)
	        xMin = maxCouple(checkArray) + min(outerRing)
	        if xMax >= xMin:
	            for x in range(xMin, xMax+1):
	                i = checkArray[0]
	                j = checkArray[1]
	                k = checkArray[2]
	                l = checkArray[3]
	                m = checkArray[4]

	                alpha = x-i-j
	                beta = x-j-k
	                gamma = x-k-l
	                delta = x-l-m
	                theta = x-m-i

	                outerCalculated = [alpha, beta, gamma, delta, theta]

	                if sorted(outerCalculated) == sorted(outerRing):
	                    a = [alpha, i, j]
	                    b = [beta, j, k]
	                    c = [gamma, k, l]
	                    d = [delta, l, m]
	                    e = [theta, m, i]
	                    # start from the triplet with the smallest outer node
	                    min_val = min(alpha, beta, gamma, delta, theta)
	                    if alpha == min_val:
	                        ring = a+b+c+d+e
	                    elif beta == min_val:
	                        ring = b+c+d+e+a
	                    elif gamma == min_val:
	                        ring = c+d+e+a+b
	                    elif delta == min_val:
	                        ring = d+e+a+b+c
	                    elif theta == min_val:
	                        ring = e+a+b+c+d
	                    s = ''.join([str(n) for n in ring])
	                    record.append(int(s))
	print(max(record))


Ans: 6531031914842725

Discovering Python & R

— my journey as a worker bee in quant finance
