Chapter 7: Functions and methods

Julia is what is described as a functional programming language, meaning that functions are the principal building blocks of a Julia program (as opposed to objects and their instances in OOP). Introducing functions is the last part we are missing before we can start building fully-fledged applications to solve real world problems. Let's get cracking!

Syntax and arguments

General syntax and invocation

There are two general ways to define a function. The first way is usually suited for simple, single-expression functions, while the second way is more suitable for longer functions that include multiple expressions.

Single expression functions

Single expression functions are written very similarly to their mathematical form:

    julia> geom_average(a,b) = sqrt(a^2 + b^2)
    geom_average (generic function with 1 method)

    julia> geom_average(3,4)
    5.0

Multiple expression functions

If your function is more complex, and needs to evaluate multiple functions, this syntax is no longer suitable. The syntax to use in such cases uses a block, introduced by function and terminated by end, to describe the function:

    julia> function breakfast(pancakes, coffee)
               println("$coffee cups of coffee and $pancakes pancakes, please.")
           end
    breakfast (generic function with 1 method)

    julia> breakfast(2,4)
    4 cups of coffee and 2 pancakes, please.

Return values

In general, Julia returns the last value to come from the last calculation within the block:

    julia> function dinner(sausages, mash)
               cost_of_sausages = sausages * 0.85
               cost_of_mash = (mash == true ? 0.60 : 0.00)
               cost_of_sausages + cost_of_mash
           end
    dinner (generic function with 1 method)

    julia> dinner(2, true)
    2.3

While we haven't told Julia what we exactly want the function to return, it infers that it would probably be the result of the last calculation (cost_of_sausages + cost_of_mash).

Now imagine that the fictitious canteen, who are so keen on calculating the cost of sausages and mash for dinner, get back to you and want the function to be changed. They are, it turns out, only interested in the cost of sausages. You could simply put cost_of_sausages to the very end of the function, before the end keyword, or you could use the return keyword, which will tell the function what to give back. Let's redefine dinner(sausages, mash) to fit the canteen's expectations using the return keyword:

    julia> function dinner(sausages, mash)
               cost_of_sausages = sausages * 0.85
               cost_of_mash = (mash == true ? 0.60 : 0.00)
               return cost_of_sausages
           end
    dinner (generic function with 1 method)

    julia> dinner(2, true)
    1.7

As a matter of style, return is a good idea to use, even if the function would return the right value. Whoever ends up debugging the script will be grateful you told them what exactly a function ends up returning.

Variable numbers of positional arguments: ... ('splats')

The above simple function had a definite number of arguments that had to be in a particular order. Arguments where the identity of the particular argument is determined by its position among the arguments are called positional arguments – so in the example above, Julia knew the argument 2 related to sausages, not mash, because that's the position in which it was defined. But what if you don't know how many inputs you are likely to get for a particular function? Let us imagine a function, called shout(), that shouts the patrons' orders back to the short-order cook. Some customers want a long list of items, others just one or two.

One way to implement this is to expect an array argument:

    function shout(food_array)
        food_items = join(food_array, ", ", " and ")
        println("Get this guy $food_items\!")
    end

Invoking this with two arguments, we get

    julia> shout(["some pancakes", "sausages with gravy"])
    Get this guy some pancakes and sausages with gravy!

These are returned to the function as a tuple listing each element covered by the splats.

What, however, if someone has more of an appetite? Our shout() function can accommodate it - the array just needs to get larger. This approach is perfectly viable, but regarded as a bit clumsy. What if I forget the array square brackets, for instance?

    julia> shout("pancakes")
    Get this guy p, a, n, c, a, k, e and s!

Well, that's not quite what I wanted! Fortunately, Julia allows us to have not merely multiple arguments but indeed an indefinite number. We effect this by suffixing the variable we wish to hold the positional arguments with three full stops ..., also known as a 'splat':

    function shout_mi(foods...)
        food_items = join(foods, ", ", " and ")
        println("Get this guy some $food_items\!")
    end

Now our function performs perfectly, whether our customer is ravenous or he just wants some pancakes:

    julia> shout_mi("pancakes")
    Get this guy some pancakes!

    julia> shout_mi("pancakes", "sausages", "gravy", "a milkshake")
    Get this guy some pancakes, sausages, gravy and a milkshake!

What, however, if our customer does not seem to say anything? We would expect this to raise an error... but it doesn't:

    julia> shout_mi()
    Get this guy some !

Therefore, we need be mindful of using a splatted positional argument right in the beginning, since it will accept the input of, well, no input! A common way to fix this is to require one positional argument, then add a splatted second argument. This way, if the function is called with no arguments at all, it will raise an error. A better way, perhaps, is to simply test for it ourselves:

    function bulletproof_shout(foods...)
        if length(foods) > 0
            println("Get this guy some $(join(foods, ", ", " and "))\!")
        else
            error("The customer needs to order something!")
        end
    end

As you recall, the foods variable is passed on to us as a tuple. We can use the length() function on this tuple (although do note, we do test explicitly for length(foods) > 0: a result of zero would not be 'falsey', so testing simply for if length(foods) would not cut it!), which will indicate how many elements the tuple has and raise an error if it is zero:

    julia> bulletproof_shout("sausages", "pancakes", "gravy")
    Get this guy some sausages, pancakes and gravy!

    julia> bulletproof_shout("sausages")
    Get this guy some sausages!

    julia> bulletproof_shout()
    ERROR: The customer needs to order something!
     in bulletproof_shout at none:5

Finally, the function works. The last marginal case that you might want to deal with is when the customer's order consists of an empty string "" or is the wrong type. These are further marginal cases and will not be explored here (although we will be looking at user input in quite a bit of detail in the second part of the book). The take-away is this - a good function (one you would let your grandmother use) needs to cater for a range of marginal cases and inputs. Splats are, however, somewhat performance-consuming and are best avoided in code that needs to run fast. In such situations, usability and performance need to be weighed and balanced.

Optional positional arguments

Positional arguments may be 'optional'. This does not mean they are not used - they are optional only from the user's perspective, who will not be required to enter them. A perhaps better way to put this is that these arguments have default values that take effect if they are not provided at invocation. Consider the following function, which accepts 2D as well as 3D coordinates, and sets 2D coordinates, by default, on the z = 0 plane:

    function coords(x, y, z = 0)
        return(x,y,z)
    end

The result:

    julia> coords(1, 6, 7)
    (1,6,7)

    julia> coords(3, 1)
    (3,1,0)

Setting defaults allows you to prevent the inevitable error that would be triggered if z = 0 were not provided for. Consider, for instance, what would happen if the value for y, for which no default value has been set, were to be missing:

    julia> coords(3)
    ERROR: `coords` has no method matching coords(::Int64)

Julia is telling us, in its somewhat odd grammar, that the function coords() is not defined for a single input.

Keyword arguments

The drawback of positional arguments is that getting the order right can be an inconvenience. Wouldn't it be much easier, not the least from a documentation perspective, if we were allowed to give arguments names and use these names at invocation? With Julia, you can do so at your heart's content, as long as you put them at the end of your variables when defining the function and delimit keyword arguments from non-keyword arguments with a semicolon ;, as in this snippet:

    function buzzphrase(verb, adjective; subject="defence", goal="world peace")
        println("$(verb)ing $adjective $subject for $goal.")
    end

In this function, verb and adjective are necessary positional arguments. However, you can use the keyword argument syntax for subject and goal. As you can see, both have defined default values – this is necessary for keyword arguments in Julia. Thus,

    buzzphrase("leverag", "effective", subject="best practices", goal="increased margins")

is equivalent to

    buzzphrase("leverag", "effective", goal="increased margins", subject="best practices")

and yield the same results.

Stabby lambda functions: ->

Sometimes, you're in a hurry and need a throwaway function. Whether it's for mapping an Array or comparing values in a sort, sometimes you don't want to define a function. A number of languages refer to these as anonymous functions, because they do not have a defined name, or reserve a lambda keyword for this, harkening back to Alonzo Church's 'lambda calculus' well before the advent of modern computers. Julia has a stylised arrow ->, leading to the name stabby lambda for such functions.

Assume you want to map the array of all primes under 10 [2,3,5,7] to a function f so that f(x) = 2x^3 + x^2 - 2x + 4. In case you're unfamiliar with map() functions, here's the elevator pitch: map functions take a function and an iterable and return an iterable of equal length, each element of which will be the result of feeding an element of the original iterable into the function. In Julia, map() takes two arguments - a function and an iterable. For the former, you can use a function defined in advance or use the stabby lambda notation that is the subject of this section.

The mapping function would be written in the stabby lambda notation as

    x -> 2x^3 + x^2 - 2x + 4

somewhat similar to the maplet notation in mathematics. Thus, we would use the map function as follows:

    julia> map(x -> 2x^3 + x^2 - 2x + 4, [2, 3, 5, 7])
    4-element Array{Int64,1}:
      20
      61
     269
     725

The stabby lambda is a little controversial, being even discouraged where it serves as a mere wrapper by the official Julia Style Guide, for the reason that such functions are impossible to unit test and can make code confusing. In general, the advice that is often given to, and by, Python programmers about lambdas in Python holds for their stabby Julia equivalents: a stabby lambda should be obviously and unambiguously true, that is, it should be evident at first glance

  • what it does,
  • how it does what it does, and
  • that it does what it's supposed to do correctly.

In other words, consider a stabby lambda a sort of 'special pleading' - you're arguing that the function is so trivially true, defining it in a long and extensive way would benefit the code less than what is gained by the brevity of the stabby lambda syntax.

do blocks

do blocks are another form of anonymous functions. Similarly to stabby lambdas, they introduce a functional process that doesn't need to be defined by name. Let's consider the stabby lambda in the previous example and try to rewrite it as a do block:

    map([2, 3, 5, 7]) do x
        2x^3 + x^2 - 2x + 4
    end

The do block is a bit of syntactic sugar that helps us avoid unduly long stabby lambdas, as well as do slightly more complex things that the stabby lambda's restricted format might not allow for, such as more complex testing than a stabby lambda coupled with a ternary operator would allow:

    map([2, 3, 5, 7]) do x
        if mod(x, 3) == 0
            x^2 + 2x - 4
        elseif mod(x, 3) == 1
            2x^3 + x^2 - 2x + 4
        else
            2x-4
        end
    end

Returning multiple values

A function needs to return a single object, but that object may take the shape of a collection containing multiple values. If your function does return multiple values from within the function, they will be returned as a tuple:

    julia> function squares(x, y)
               return x^2, y^2
           end
    squares (generic function with 1 method)

    julia> squares(2,5)
    (4,25)

    julia> typeof(squares(2,5))
    (Int64,Int64)

As we can see, the function returned two values of type Int64, in a tuple. For various reasons, you may prefer defining your own type to return, such as a composite type - this is up to you and Julia gives you considerable freedom in doing so.

Scope in function evaluation

Scope in function evaluation refers to the availability of variables within or outside a function. Much of what has been said about scope in blocks in general applies here, but function evaluation has some peculiar quirks that are worth mentioning.

Lexical scoping

Julia implements lexical scoping, that is, the scope of a function is inherited not from its caller but its definition. Consider the following:

    function foo()
        println(x)
    end

    function bar()
        x = 2
        foo()
    end

    julia> bar()
    ERROR: x not defined
     in bar at none:3

This is not unexpected, since the assignment of x to 2 is 'not visible' to the function foo when it's called. In other words, the assignment of x is outside the scope of the function. Therefore, it does not see the variable's definition and this yields an undefined variable error.

Global variables

A variable defined in the global scope is available to all functions:

    julia> x = 2
    2

    julia> foo()
    2

While this is very helpful, global variables incur an immense performance penalty. Their use is generally discouraged unless absolutely necessary.

Higher order functions

In general, the idea of a higher order function serves to distinguish functions that accept a function as an argument from other functions, sometimes referred to as first-order functions. In functional programming, higher order functions are much more important than in OOP or other paradigms, and indeed even if you return to your OOP roots, an understanding of higher order functions will help you enormously in dealing with the implementations of higher order functions in your language of choice: since most higher-order functions are so useful for munging data, most programming languages do have implementations of map(), sort() and other archetypal higher order functions.

Functions that accept functions as arguments

We have already introduced map(), a typical higher-order function, above. While higher-order functions appear to be somewhat complex, they are actually easier than they seem. A function is an object like any other, and so can be fed into another function as an argument. You will not, generally, need to do anything special for your function to accept a function as an argument, except make sure you are calling the function provided to you.

    function greet(x)
        str = x()
        println("Hello, $str\!")
    end

    function tell_me_where_I_live()
        return("world")
    end

    julia> greet(tell_me_where_I_live)
    Hello, world!

Quite importantly, when you are passing a function to another function as an argument, you are not passing a call, you're passing the function object - so don't forget to skip the parentheses ()!

Operators and higher-order functions

Operators, such as +, are just clever aliases for functions. Thus, there is no reason why they couldn't be passed into a function:

    function oper(x, y, z)
        return x(y, z)
    end

    julia> oper(+, π, e)
    5.859874482048838

In this case, the operator + was fed into our function (which did nothing but execute the operator fed in as x on y and z).

Functions that return functions

Just as accepting functions is perfectly permissible, a function can return a function as a result. Consider a function that returns an exponential function based on your input as the exponent.

    function create_exponential_function(exponent)
        exp_func = function(x)
            return x^exponent
        end
        return exp_func
    end

    julia> power_of_five = create_exponential_function(5)
    (anonymous function)

    julia> power_of_five(5)
    3125

The function above can be written more concisely with the stabby lambda syntax we encountered earlier:

    function create_exponential_function(exponent)
        y ->  y^exponent
    end

Currying

Some languages, including some functional languages, support a feature called currying, named not after the Indian spice but after logician Haskell Curry (namesake of the Haskell language). A curried function is one that has multiple arguments. If it is provided with values for all of them, it returns a value. If it is provided with only part of them, it returns a function that takes the missing values as arguments.

Currying was proposed for Julia in 2012, but voted down, not least because it would have been difficult to accommodate within multiple dispatch.

Methods and multiple dispatch

Understanding multiple dispatch

When you call a function on a number of arguments, Julia needs to decide how exactly that function makes sense for those arguments. In this sense, functions are not so much names for individual functions but for bunches of conceptually similar functions, with Julia deciding which particular one to call. Consider the * operator (which, like all operators, is a function):

    julia> π * e
    8.539734222673566

    julia> "sausages " * "mash"
    "sausages mash"

As the example above shows, the * function can take various types, and it has various actions defined for each - for numeric types, this involves multiplication, while for strings, * means concatenation. The feature of Julia that allows the call of the right implementation of a function based on arguments is called multiple dispatch, and the implementations are referred to as methods. Each function may have a number of methods defined for various data types, and it may have no methods at all defined for some. Finally, the error message we get when we use the 'wrong' type of input starts to make sense:

    julia> 2 * "sausage"
    ERROR: `*` has no method matching *(::Int64, ::ASCIIString)
    Closest candidates are:
      *(::Number, ::Bool)
      *(::Int64, ::Int64)
      *(::Real, ::Complex{T<:Real})
      ...

What Julia is referring to in this instance is that * is not defined for one Int64 and one ASCIIString operator. In other words, the function * has no method defined that would take these two particular kinds, after which it then recommends various options (some fairly unexpected, for instance, ::Number * ::Bool is perfectly valid – it multiplies the ::Number by 1 if the ::Bool is true and 0 if it is false).

Building methods

To construct a method, you can simply declare the function for a particular data type. Let's consider a function that adds numbers and concatenates strings (for now, only two of each - the function can be expanded using the splat ... syntax easily).

    function merge_together(a::Number, b::Number)
        a + b
    end

This is great. It does a great job at adding up numbers:

    julia> merge_together(2, π)
    5.141592653589793

It's less adept at doing the string concatenation part we need it to do:

    julia> merge_together("Sausages with", " mash")
    ERROR: `merge_together` has no method matching merge_together(::ASCIIString, ::ASCIIString)

Therefore, we will need to define a method for merge_together() that will accept ASCIIString arguments. When Julia tells us a method is missing, it will give us the concrete data type of the argument we have entered. This is useful, but try to resist the temptation to define merge_together for ::ASCIIString. In general, if your use case relates not to the concrete type but to the broader, abstract type (such as ours, where our use case is really all strings, not just ASCIIString), it's good practice to use the broadest abstract type that will include only the data types that you need. In this case, it is not ASCIIString but its abstract ancestor, AbstractString (in case you forgot your handy inheritance dendrogram, you can look at the supertype of any type by using super(ASCIIString) with the name of the type you're interested in). Let's define merge_together for two ::AbstractString objects:

    function merge_together(a::AbstractString, b::AbstractString)
        a * b
    end

That's it, folks! Julia helpfully tells us that merge_together now has two methods. Using methods(merge_together), we can list these:

    julia> methods(merge_together)
    # 2 methods for generic function "merge_together":
    merge_together(a::Number,b::Number) at none:2
    merge_together(a::AbstractString,b::AbstractString) at none:2

Let's give the second one, for strings, a try:

    julia> merge_together("Sausages with", " mash")
    "Sausages with mash"

It works! In general, when creating a function, you need to be circumspect as to what you want to use it for and what it needs to be able to deal with. There is no need for a function to have methods for all data types. So far, we have generally not defined the data types of arguments. This is a bad practice, and when you are building functions, you should always think of yourself as building methods at the same time, and define the types you want your function to accept.

Call order and method ambiguities

Consider the following function f:

    function f(x)
        return x
    end

    function f(x::Int)
        return x^2
    end

The first definition, lacking a type restriction, is deemed by Julia to accept inputs of type Any - that is, any type. The second method, however, only takes inputs of type Int. As such, it is more specific (or, if you please, 'further downstream on the type dendrogram'). The result is that when you call f(2), the second, more specific method will be called, even if technically, the argument 2 would be acceptable for both. This is a sensible approach, since the broader the type, the more likely that the method is intended to be a 'catch-all' to mop up cases that have not been caught by any of the subtypes.

However, for functions with multiple arguments, it is possible that there is no unique method that is more unambiguous than the others. Consider the following:

    function g(x::Int, y)
        return 2x^2 - 2y
    end

    function g(x, y::Int)
        return 2x - 2y^2
    end

Which of these functions is 'more definite' when called as, say, g(6, 8)? The answer is 'neither', and Julia says so much when declaring the second method:

    Warning: New definition
        g(Any,Int64) at none:2
    is ambiguous with:
        g(Int64,Any) at none:2.
    To fix, define
        g(Int64,Int64)
    before the new definition.
    g (generic function with 2 methods)

A method g(x::Int64, y::Int64) will be more specific than either of the previously defined methods, and as such capable of dealing with the indefinite middle.

    function g(x::Int, y::Int)
        return x^2 - y^2
    end

Parametric methods

A parametric method, similar to parametric types, is one in which a logical relationship is asserted between types, rather than an actual type name. You may think of parameters as 'variables' for type assertions. The parameter - by convention, but not by necessity, T for type - is enclosed in curly braces {} and interposed between the function name and its arguments:

    function identical_types{T}(x::T, y::T)
        ...
    end

This function would accept arguments of the same type, regardless of what that type is. You can restrict the possible values T might take based on type hierarchy:

    function identical_numbers{T<:Number}(x::T, y::T)
        ...
    end

This function allows for any inputs that are both identical and descendants of the Number supertype. Contrast that with

    function divergent_numbers(x::Number, y::Number)
        ...
    end

which accepts inputs that are descendants of the Number supertype, regardless of whether their type matches or not.

Inspecting methods

Entering a function object into the REPL, but not calling the function object, will indicate the number of methods under the function:

    julia> +
    + (generic function with 150 methods)

You can inspect methods available under a function by using the method() command and passing the function or operator as argument:

    julia> methods(+)
    # 150 methods for generic function "+":
    +(x::Bool) at bool.jl:34
    +(x::Bool,y::Bool) at bool.jl:37
    ...
    +(a,b,c) at operators.jl:83
    +(a,b,c,xs...) at operators.jl:84

results matching ""

    No results matching ""