This post is not going to be very maths-heavy but will include some concepts from the area of graph theory which we will use in an unusual setting.

Today we’re going to get into the virtual kitchen and get our virtual taste buds a-buzzing….Well, actually we’re going to get our computers to do the hard work of finding out some tasty recipes and we’re going to use a few different techniques to do this. The workhorse is going to be the Mathematica programming language, but a lot of what we will do will be doable in any programming language, though perhaps it will be slightly more cumbersome than this.

The starting point of our culinary adventure will be a piece of text, taken from a website of flavour pairings here.

We see here that we have a piece of text which we can copy and paste into our Mathematica file. The text starts:

  • Allspice pairs well with: apples, beets, cabbage, caramel, cardamom, cinnamon, cloves, coriander, ginger, juniper, mace, mustard, nuts, nutmeg, onions, pears, pumpkin, root vegetables, yams
  • Almond pairs well with: apple, apricot, banana, caramel, cherry, coffee, fig, honey, orange, peach, pear, plum
  • Anice pairs well with: apples, beets, caramel, carrots, chocolate, citrus, cinnamon, coconut, coriander, cranberry, fennel, figs, fish, garlic, peaches, pomegranates, pumpkin
  • Apple pairs well with: caramel, cardamom, chestnut, cinnamon, cranberry, currant, ginger, hazelnut, mango, maple, rosemary, walnut
  • Apricot pairs well with: almond, black pepper, caramel, cardamom, ginger, hazelnut, honey, orange, peach, vanilla, plum
  • Asian Pear pairs well with: ….
  • and so on…

The aim of this exercise will be to take this data and try and write some code which will automatically generate recipes for us (at least in terms of the combinations of ingredients which will go well together).

At the moment the whole thing is one very long string (data made up of symbols which don’t have a specific meaning as a command in a programming sense), and we would like to split the string up into more bite-sized (haha!) chunks which we can put into a lovely visual form and then pull out recipes from it.

I will add a second post regarding exactly how we will do this, but the topics which will be needed are:

  • Text processing including regular expressions and edit distance (a concept from the world of natural language processing)
  • Functional and symbolic programming
  • Concepts from Graph Theory

We will show in the detailed post that we can extract from the text all the details about which food goes well with which in a simple matrix form. How can we display this graphically? Well, we can think that every ingredient is a point, and every statement “A goes well with B” is a line joining ingredients (points) A and B. Thus, the information is well represented as a graph. Not a graph in the sense of a graphical representation of a function from x to y (f:\mathbb R\rightarrow \mathbb R), but a graph in the sense of a series of vertices (also called nodes) and lines joining them .

Graphs are ubiquitous in the natural description of many areas of life, from computer networks to real life social networks, from evolutionary biology to, apparently, gastronomy! (Postscript: Apparently something similar has been done before – see here)

Let’s look at how all the data looks if we plot it as a single graph

foodgroups3well this is useful for neither chef nor beast…

In fact the total number of ingredients in the list is too large to plot the whole thing on a single page, and to be able to read the details, so let’s plot just a subset of them (also known as a subgraph). Here are a few ingredients (randomly chosen) displayed in a graph:

foodgroups4

Remember, each line (also called an edge in graph theory) joining two ingredients (vertices) means that they go well together. So caramel and pear go well together and coconut and curry leaf go well together. Remember that this is simply a small sampling of ingredients and links, and so there are links here which should be there in the full data but are not. For instance pear and vanilla go well together.

So, what is a recipe? Well, we can think (in a very simplified way) that a recipe is a set of ingredient where each one goes well with each other. See the video of Grant Achatz from Alinea here:

What does this mean in the sense of the graph? Well, it will be a set of points where each point is connected to every other in the set. This can also be described as a totally connected subgraph, also known as a clique.

Thus, if we find the cliques in the graph, we will have found recipes!

We can ask for cliques of a given size, so, for instance the following graph has a single clique with 3 vertices:

foodgroups2Here you can see that mandarin goes well with nutmeg, mandarin goes well with cinnamon and cinnamon goes well with nutmeg. Three ingredients which go well together and thus a recipe of three ingredients is formed…ok, so not quite a recipe, but at least a list of ingredients which can become the basis of a recipe.

However, to pull out the useful information, ie the cliques, we don’t need to be able to look at the data in graph form, we simply need to be able to ask for cliques, and luckily in Mathematica, this is pretty easy. We simply write

FindCliques[graph, {clique_size_you_want}, All]

…easy as that!

For instance, we can ask for cliques with 7 elements in, and (a subset of) the answers are:

  • onion, fennel, dill, basil, thyme, oregano , parsley
  • rosemary, basil, marjoram, thyme, mushrooms , oregano, parsley
  • fennel, garlic, dill, basil , thyme, oregano, parsley
  • caraway, paprika, garlic, dill, thyme, oregano, parsley
  • allspice, apple, cardamom, cilantro, clove, coriander, ginger
  • cardamom, cilantro, clove, coriander, cumin , ginger, curry
  • cardamom, cilantro, clove, coriander, cumin, ginger, citrus
  • allspice, cardamom, cilantro , clove, coriander, cumin, ginger
  • allspice, apple, caramel, cardamom, cinnamon, ginger, pear
  • cardamom , cinnamon, clove, coriander, cumin, ginger , curry
  • allspice, apple, cardamom, cinnamon, clove , coriander, ginger
  • chocolate, cinnamon, clove, ginger , nutmeg, orange, vanilla

Actually Mathematica only finds cliques like this which are not subgraphs of larger cliques (known as maximal cliques), so we have to work a little harder to find all 40 or so cliques of size 7.

Well, that’s kinda neat – we’ve just found sets of 7 ingredients, all of which go well with one another. For any cooks out there, these combinations won’t be particularly surprising, though the allspice, apple, cardamom, cilantro, clove, coriander, ginger combination is perhaps a little unusual and sounds quite a bit like a variation of a Vietnamese salad.

We can do better than this. We can ask for cliques which contain certain ingredients, or even combinations of ingredients. For instance we can ask for cliques with vanilla and lemongrass. Apparently there are two cliques with these ingredients, the largest of which is chocolate, coconut, ginger, vanilla, lemongrass which sounds pretty interesting!

We can also ask about ingredients which might go badly together – these may be thought of as those ingredients which have the longest path in between them. Given a graph g,  we can find these very simply with the command:

longestpath=

With[{vl=VertexList[g]},Table[FindShortestPath[g, vl[[m]], vl[[n]]]//Length, {m, Length[vl]}, {n, Length[vl]}]//Max]

With[{vl=VertexList[g]},{vl[[#[[1]]]], vl[[#[[2]]]]}&/@Position[Table[FindShortestPath[g, vl[[m]], vl[[n]]]//Length, {m, Length[vl]}, {n, Length[vl]}],longestpath]]

Don’t worry about the syntax here if you are not familiar with Mathematica – this is just asking for the maximal shortest path between ingredients.

This gives a large number of different pairs of ingredients which have 4 ingredients in between them. For instance apparently eggplant and white chocolate wouldn’t make a great pairing according to the following:

foodgroups5

However, I’m not completely convinced about some of the supposedly bad combinations.  For instance, apparently maple and watermelon are very bad flavour pairings. This certainly doesn’t seem like a bad combination to me and so I think that we are seeing the limitation with the relatively small number of flavour pairings shown in the text. We would need a more comprehensive flavour pairing list in order to investigate this further.

There are some rather surprising combinations of flavours which go well together. For instance, caviar and white chocolate is supposed to be a good pairing due to the presence of certain amine compounds in both of them.

Ok, so these are some of the simple things we can play around with. In the next post I will show how we actually coded this in Mathematica using concepts like Editdistance and some simple graph theoretic commands.

Part 2 can be found here.

How clear is this post?