Some great fun with convolutions

Convolutions, I realised recently, loosely mean ‘slide a small matrix over an entire image, multiplying and adding as you go’, which hit me like a soft plastic hammer as I muttered a small ‘oh’ in the corner of a coffee shop nearby.

The cool thing, to me, is that the ‘image’ is actually an image as understood by my ageing laptop (and anyone else’s, really): a bunch of numbers in a 2d array (or a 3d array, if it’s in colour). so convolutions are.. little matrix operations swept across a 2d grid. hmm.. I guess that means convolutions also happen in 1d and 3d and so on… (what happens when you convolve an ASCII grid dataset of the world?) anyway, here’s a cat:

cutest_kitten

google image search wizardry has decided that she is the world’s cutest kitten. (source)

so what sort of things do you do to the kitten? it turns out a lot of the filters we use in Photoshop are convolutions. so it’s a filter? yes, and it’s also a kernel. google also says a kernel is a small matrix that you plaster all over the kitty to make it different (no, not better, just.. different).

try doing this tutorial, which does the kitty transformation in tensorflow (rather than with your own loops, because an image reads better as a numpy array than as a nested list). Or go change the numbers in this one. below is what I got from the tf tutorial:

blur

blur kitty

sharpencat

sharp kitty

random effect kitty

some randomly generated kernels. yes they’re kinda creepy. sorry kitty.
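
if you want to poke at a kernel yourself without the full tutorial, here’s a minimal sketch of the idea (assuming a greyscale kitten.jpg on disk; the 3×3 sharpen kernel is the classic one, not anything specific to that tutorial):

```python
import numpy as np
import tensorflow as tf
from PIL import Image

# load the kitten as a greyscale 2d array of floats
img = np.array(Image.open("kitten.jpg").convert("L"), dtype=np.float32)

# a classic 3x3 sharpen kernel -- swap in any 3x3 matrix you like
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float32)

# tf.nn.conv2d wants 4d tensors: [batch, height, width, channels]
# and a 4d kernel: [k_height, k_width, in_channels, out_channels]
img4d = img.reshape(1, *img.shape, 1)
kernel4d = kernel.reshape(3, 3, 1, 1)

out = tf.nn.conv2d(img4d, kernel4d, strides=1, padding="SAME")

# back to a plain 2d image, clipped to valid pixel values
result = np.clip(out.numpy()[0, :, :, 0], 0, 255).astype(np.uint8)
Image.fromarray(result).save("sharp_kitten.jpg")
```

swap the kernel for a 3×3 full of 1/9s and you get the blur kitty instead.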

 

So…….

What are Convolutional Neural Networks then?

They’re kernels that learn to write themselves.

This one hit me like an iron hammer, and I was happy for quite a while after that.

 

What? Why do neural networks want to write their own blur filters? Oh no no, it’s not a blur filter to the network, because of the way you train them. I shamelessly coin them Intelligence filters. haha. Filters in a Convolutional Neural Network learn the best way to ‘see’ something so that the network can make a good guess at what it is.

In a Convolutional Neural Network, you stack filters in layers, one in front of the other, each seeing more and more of the image (actually you stick neurons arranged like filters), put that stack in front of your regression / classification network, and train the whole thing with backpropagation (very loosely speaking, of course).

fig1

source

so if you look at the above image, what is going on is:

input image > squeeze > squeeze > squeeze > squeeze > connect them all together > classify

The backpropagation goes all the way through, so the filters that do the squeezing (extracting the ‘essence’ of a small part of the image) learn where they went wrong and what to change to make themselves less wrong.

As a result, after training it learns which bits are important (the almost-white ones, because white means 1 here, which means activated), and because the neurons are laid out in 2d and it’s learning about images, we humans get to see which feature of the image each neuron is interested in.
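
for what it’s worth, the whole squeeze > squeeze > connect > classify pipeline is only a handful of lines in Keras. a minimal sketch (the layer sizes and the 10-class output here are placeholders, not anything taken from the figure above):

```python
from tensorflow import keras
from tensorflow.keras import layers

# input image > squeeze > squeeze > connect them all together > classify
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),           # a small colour image
    layers.Conv2D(16, 3, activation="relu"),  # 16 learnable 3x3 filters
    layers.MaxPooling2D(),                    # squeeze
    layers.Conv2D(32, 3, activation="relu"),  # 32 more filters, each seeing a larger portion
    layers.MaxPooling2D(),                    # squeeze again
    layers.Flatten(),                         # connect them all together
    layers.Dense(10, activation="softmax"),   # classify into 10 guesses
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```

calling model.fit() on some labelled images is what sends the backpropagation all the way back through those conv layers.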

I haven’t actually trained my own CNN yet, because I like knowing things in as much detail as I can before doing it, but when I do i’ll update this post!

follow along with this tutorial, it’s really quite fun and illuminating. This one too.


How to use Flux

TLDR: sign up for flux, click site extractor, pick region, press extract, open your program, download from flux, have site model (with caveats).

flucx

edit Mar 9 2018: it seems flux is being shut down. if you have the stomach for it, have a look at my home-brewed version


Owl > Tensorflow > Owl!

tf with hyperparams.gif

With the help of Mateusz (the creator of Owl), there’s now a live-updating window showing tensorflow training inside grasshopper! the graph at the end is matplotlib’s plot.

tf with hyperparams_cropped.gif

Closeup. It basically shows the terminal output as the training runs. A previous homemade python version also ran the script from the command line, but sort of just froze the gh canvas while it trained. Wasn’t very exciting.

Untitled

An excerpt from the python script running behind this (only showing the fun part). Runs on Keras with hyperparameters passed from grasshopper.
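
since the image above only shows an excerpt, here’s a rough sketch of how such a script can be wired up (the argument names, model, and stand-in data here are placeholders, not the actual script):

```python
import argparse
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# hyperparameters arrive from grasshopper as plain command-line arguments
parser = argparse.ArgumentParser()
parser.add_argument("--hidden", type=int, default=32)
parser.add_argument("--lr", type=float, default=0.01)
parser.add_argument("--epochs", type=int, default=50)
args = parser.parse_args()

# stand-in training data (the real script reads whatever grasshopper exports)
x_train = np.random.rand(200, 2)
y_train = x_train.sum(axis=1, keepdims=True)

model = keras.Sequential([
    keras.Input(shape=(2,)),
    layers.Dense(args.hidden, activation="tanh"),
    layers.Dense(1),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=args.lr), loss="mse")

# verbose=1 prints a progress bar per epoch -- that terminal output is what the live window in gh shows
model.fit(x_train, y_train, epochs=args.epochs, verbose=1)
```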

Kaggle datasets in Rhino

datasets

melbourne housing dataset.csv from kaggle.

this was using the older dataset (link to kaggle) with 9 features. coordinates were pulled from Google’s Geocoding API based on the addresses in the dataset. I believe the updated dataset provides coordinates too, possibly obtained with the same method.
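
for reference, the geocoding step is just a loop over the address column against Google’s Geocoding API, roughly like this (a sketch: the api key, file name, and column names are placeholders, so check them against the actual csv):

```python
import csv
import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
API_KEY = "YOUR_API_KEY"  # placeholder -- you need your own key

coords = []
with open("melbourne_housing.csv", newline="") as f:
    for row in csv.DictReader(f):
        # column names are assumptions; check the header of the kaggle csv
        address = f"{row['Address']}, {row['Suburb']}, Melbourne, Australia"
        resp = requests.get(GEOCODE_URL, params={"address": address, "key": API_KEY}).json()
        if resp["results"]:
            loc = resp["results"][0]["geometry"]["location"]
            coords.append((row["Address"], loc["lat"], loc["lng"]))
```

the resulting lat/lng pairs are what tie each spreadsheet row to a point in rhino.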

number of rooms

showing different ranges of number of rooms per unit

kopt

‘if i’m looking for houses that are between 5 to 7 rooms, the newest ones are the light yellow ones on the northeast edge of the city.’

other details that come for free once you link an excel datasheet to real-world coordinates:

contour

height of property, contour of surrounding land

watershed

flow of water through property and general direction of water flow. (flash floods, landslides possibly?)

street view

pictures around the site (from Google Street View)

google directions

of course, different methods of transport to and from the city

clouds

weather data at a given time (historical data is paywalled, so i couldn’t access it)

soil_cropped.gif

soil conditions and suitability for certain forms of construction

aussie

…and some really zoomed out GIS level datasets (i’m still wondering what to do with them.)

Owl in Galapagos

TLDR: predicting rectangles with two neural networks and galapagos.

 

galapagos on owl 3

galapagos used to discover the best shapes for two time series neural networks

just realised that galapagos would potentially be very useful (or actually, another backprop NN might be even faster) for testing out optimal ‘window’ sizes for a time series neural network (the ‘view’ that the neural network sees when it learns to predict a number series).

When predicting with a time series neural network, one of the problems that bugged me is that we don’t know the best window size for prediction (called look_back in this tutorial). Too small and the NN learns that the series only ever goes up or down; too large and it misses out on too many details.
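
for reference, look_back just decides how the series gets sliced into training pairs, something like this (a sketch of the windowing idea; the linked tutorial does essentially the same thing):

```python
import numpy as np

def make_windows(series, look_back):
    """Slice a 1d series into (window, next value) training pairs."""
    x, y = [], []
    for i in range(len(series) - look_back):
        x.append(series[i:i + look_back])   # what the network 'sees'
        y.append(series[i + look_back])     # what it should predict next
    return np.array(x), np.array(y)

series = np.sin(np.linspace(0, 20, 200))
x_small, _ = make_windows(series, look_back=2)   # too small: only learns up/down
x_large, _ = make_windows(series, look_back=50)  # too large: drowns out the details
```

galapagos then just has to pick the look_back (and the other six parameters) that give the lowest error.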

This is where galapagos (or any other appropriate learner) comes in to help find the optimal range within which the best predictions can happen.

Galapagos was used to test 7 parameters that directly affect the neural network shape and learning rate (1 for window size, then 3 for each NN: number of hidden neurons, learning rate, and steepness of the sigmoid activation function).

param1

after 15 minutes or so, it gave me some pretty decent answers for the settings required for learning the two separate parameter series.

It was quite interesting to see that the learning rate varied quite a bit between the two (one was at 0.21, the other at 0.62), and alpha (used to define the steepness of the sigmoid activation function) was at 1.344 and 0.887 respectively (and then i realised that the learning rate seems to be inversely related to alpha).

The number of hidden neurons stayed relatively similar, at 4 and 5 neurons respectively. but then, i wouldn’t have guessed those if i had just picked a middling number between the input and output counts.

param2

the result is a prediction of a series of two parameters that define a rectangle.

Ground truth dataset in Grey, predicted dataset in Yellow.

galapagos on owl 4

the accuracy falloff after training

galapagos on owl 5 initial

before training

galapagos on owl 5 initial2

initial hand-tweaking of parameters (i didn’t know which ones were best to tweak)

galapagos on owl 5 learnt

so i machine learned those parameters and it got some pretty decent predictions

galapagos on owl 5 shifted

then i shifted some starting rectangles and realised it predicts up to about 10 rectangles reliably before doing some crazy things.

plugins used: OWL, galapagos

Panel Rationalization (OWL)

never has panel rationalization been so straightforward! k-means clustering to sort similar panels!

panels are sorted by two parameters:

  • panel area
  • surface normals

and then replaced with fixed panel dimensions (the average of each cluster) plus a 20mm offset. the results are pretty decent, with minimal overlap even at the steep bits of the surface.
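
under the hood it’s nothing more exotic than clustering a list of (area, normal) vectors. a rough equivalent outside grasshopper (using scikit-learn instead of OWL, with stand-in panel data) looks something like this:

```python
import numpy as np
from sklearn.cluster import KMeans

# each panel as a feature vector: [area, normal_x, normal_y, normal_z]
# stand-in data -- in the definition these come from the surface subdivision
panels = np.random.rand(500, 4)

kmeans = KMeans(n_clusters=25, n_init=10).fit(panels)

# every panel gets replaced by its cluster's average panel
cluster_averages = kmeans.cluster_centers_
replacement_for_each_panel = cluster_averages[kmeans.labels_]
```

each panel then gets swapped for its cluster’s average dimensions (plus the offset) back in grasshopper.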

panel types

number of clusters (types of rectangles) from 2 – 50, iterations = 3

clustering iterations

k (number of clusters) = 25, iterations running from 1 – 30

EDIT: a little extra definition showing colour clustering to reduce the number of colour variations needed from 1124 down to 10–50.

18217830_10154861225159064_1385605154_n

colour variations from 10-30, 30 iterations

18197623_10154861722884064_1987930800_n

10 colours, iterations from 10-30, showing how the clustering works in realtime

plugins used: OWL

Machine Learning with OWL

Just attended a workshop last week on machine learning in grasshopper, and here are some results!

clustering final

slightly edited version of the final presentation board

The above is a combination of a few techniques in machine learning, used to find clusters of correlated sites in Dubai, based on a given combination of parameters.

here’s the breakdown:

The intention was to find hotspots in Dubai, based on a few parameters that affect the popularity of a given site, and then to create a set of street furniture that would be contextually sensitive to each site.

graph dark

here’s a pretty relationship graph to whet your appetite

1. Mapping the popularity of places in Dubai

Circular_Graph_Places_Properties

graph of all the parameters used and their numerical values in a circle

Mapping the popularity of areas in Dubai utilizes k-means clustering to find groups of places based on a few factors:

  • ratings (taken from the Google Places API)
  • number of reviews and their scores
  • the security presence on site (to gauge how private a given building is)
  • building capacity (size)
  • type of building (commercial, residential, utility, etc)
  • distance of that building from other buildings (to gauge the density of the area)

unnamed

the k-means clusters and the averaged characteristics of each (e.g. in the light blue cluster, the metro stations, power plant, burj al arab, and the police station generally have a strong security presence, get a rating of about 4.3 stars, and for some reason are considered small buildings relative to the other sites)

 

2. Design of chairs that would directly correlate to the clusters on the map

a parametric model of a street bench is created (the old-fashioned way, in grasshopper) with a set of 12 parameters defining width, divisions, backrest height etc., and then run through an autoencoder to reduce its parametric dimensionality to 2.

this means that with two sliders, one can create a set of street furniture that captures the relationships between all 12 parameters used to generate it.

this also means that by creating a 2d grid of points, one can see the entire series of permutations of the design in question, be it a chair, a house, or a city.
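
a sketch of what that autoencoder can look like in Keras (the 12-in / 2-in-the-middle shape comes from the bench parameters above; the hidden layer sizes and the training data here are placeholders):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# 12 bench parameters in, squeezed down to 2 'sliders', then back out to 12
encoder_hidden = layers.Dense(8, activation="relu")
to_latent = layers.Dense(2)                      # the two sliders
decoder_hidden = layers.Dense(8, activation="relu")
to_params = layers.Dense(12)

inputs = keras.Input(shape=(12,))
latent = to_latent(encoder_hidden(inputs))
outputs = to_params(decoder_hidden(latent))

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# stand-in training data: random bench parameter sets
# (the real ones come out of the grasshopper definition)
benches = np.random.rand(300, 12)
autoencoder.fit(benches, benches, epochs=200, verbose=0)

# a separate decoder model reuses the trained layers,
# so a 2d grid of latent points becomes the full array of bench permutations
latent_in = keras.Input(shape=(2,))
decoder = keras.Model(latent_in, to_params(decoder_hidden(latent_in)))

grid = np.array([[u, v] for u in np.linspace(-1, 1, 10) for v in np.linspace(-1, 1, 10)])
all_permutations = decoder.predict(grid)   # 100 sets of the 12 bench parameters
```

that 10×10 grid at the end is exactly the ‘entire series of permutations’ mentioned above.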

3 versions of the chair were defined by the designer (well, someone from our team) and fed into the autoencoder to find the strongest correlations between all three designs (the designs themselves sit at the top left, bottom left, and bottom right corners of the graphic below).

these correlations are then fed back through the trained autoencoder to ‘decode’ the relationships between the 3 objects, so every permutation between these three given designs carries some part of the characteristics of each designed object.

say the top left design is a simple long bench, the bottom left a short bench, and the bottom right a large circular bench with backrests and divisions in between. The network then finds a set of ‘morphed’ objects that each have a bit of ‘division-ness’, ‘long-ness’ and ‘backrest-ness’ in between them.

the entire set is then run through another k-means clustering algorithm to find out which version of seating is most suitable for which areas of Dubai, based on a different set of related parameters: this time the amount of seating area, the number of divisions, and the bench length. e.g. Dubai Mall and Emirates Mall have the highest traffic (gauged by the number of reviews and the size of the building), so they get the seating with the largest amount of area.

chair

the graphic above shows the entire array of all possible permutations of the street furniture that fit within our given definition.

I hope you can see the potential of using machine learning in architecture: more than ever, it allows real, big data to be linked directly to design in a way that is not simply a designer’s intuition. This technique can be extended to include all the parameters a modern building should juggle, like sustainability targets, cost, environmental measures, passivhaus implementations, fire regulations.. the list is endless. We as architects should start incorporating them into design automatically, so that we give ourselves more time to do the things we enjoy, like conceptual input, or our own flair in design.

the above statement doesn’t apply to people who really really enjoy wrangling with fire regulations and sustainability issues manually, or believe in the nostalgia of the architect-as-master-builder who is able to handle all the different incoming sources of requirements and create something that satisfies them ALL. disclaimer: i’m not one of them.

P.S. oh yes, and since we had a bit of free time during the workshop and didn’t know what to call our project, we decided to machine-learn our team name using a markov chain.

project-name.gif

the project name. it’s a bit of slightly random babbling that the definition below spat out after reading a wikipedia article about machine learning. kinda sounds like english though, so it’s all fine.
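
for the curious, a word-level markov chain like that is only a dozen lines of python (a sketch; the actual definition below uses OWL, ghpython and anemone rather than this, and the text file name is a placeholder):

```python
import random
from collections import defaultdict

# any source text works -- we fed it a wikipedia article on machine learning
text = open("machine_learning_wikipedia.txt").read().split()

# build the chain: each word maps to every word that ever followed it
chain = defaultdict(list)
for current_word, next_word in zip(text, text[1:]):
    chain[current_word].append(next_word)

# babble: keep picking a random successor of the last word
word = random.choice(text)
name = [word]
for _ in range(6):
    word = random.choice(chain[word]) if chain[word] else random.choice(text)
    name.append(word)

print(" ".join(name))
```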

project name process

plugins used: OWL, ghpython, anemone