What is DataWeave? Part II

MuleSoft Ambassador Team
20 min read

We would like to thank MuleSoft Ambassador Joshua Erney for his contribution to this developer tutorial.

Overview

This tutorial covers variables, flow control, and Boolean operators. By the end of it, you will understand how to create and use variables and how to use if/else expressions. The tutorial also introduces pattern matching, but a full explanation of that topic is outside its scope and will be covered in a future tutorial. To begin the tutorial, don't forget to sign up for an Anypoint Platform account.

Like other languages, DataWeave has variables so that you can store values to use later on in your script. Think of a variable as a container for your data. The name you give to the variable is like a label on the outside of the container, so you can easily find it later. Variables are useful for assigning names to values and for storing the result of a calculation that would otherwise need to be repeated. To set a variable, use the following syntax:

var var_name = expression

An expression is something that returns a value or is a value itself. Here’s an example of setting a variable to an explicit value:

var name = "Max the Mule"

You can also use variables to store the results of calculations for later use:
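A minimal sketch of this idea, assuming hypothetical variables named price, quantity, and total (the names and values are for illustration only):

```dataweave
%dw 2.0
output json
// Hypothetical values, chosen only for illustration
var price = 25
var quantity = 4
// Store the calculation once so it can be reused later in the script
var total = price * quantity
---
total
```

Here total evaluates to 100, and any later expression can refer to total instead of repeating the multiplication.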

 
 
 
Flow Control: If/else Expressions & Boolean Operators

Flow control is used when you want to execute certain parts of your code in some situations, while not executing others. In other words, it’s a way to add logic to your scripts.

A common use case for variables is to store the result of some kind of Boolean operation. Think of a Boolean operation as an expression that returns true if a criterion is met and false if it is not. The simplest Boolean operations use equality, relational, and logical operators. You might be familiar with these from other languages:

Expression   Description
A > B        Greater than
A < B        Less than
A >= B       Greater than or equal to
A <= B       Less than or equal to
A == B       Equal to
A ~= B       Similar to*
!A           Logical negation**
not A        Logical negation**
A and B      Logical and
A or B       Logical or

* Tries to coerce two values of differing types to the same type and then compares them (e.g., "1" ~= 1 evaluates to true).
** Both of these operators perform logical negation (i.e., both !true and not true evaluate to false). However, ! has a very high precedence: it always operates first in a chain of logical operators. not, on the other hand, has a low precedence: it typically operates last in a chain of logical operators. For example, !true or true evaluates to true because !true is evaluated first, giving false or true. In contrast, not true or true evaluates to false because true or true is evaluated first to be true, then negated by not to be false.
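The two footnotes above can be checked with a short script. This is a sketch only; the field names similar, highPrecedence, and lowPrecedence are hypothetical labels, and the commented results are the ones described in the footnotes:

```dataweave
%dw 2.0
output json
---
{
  // ~= coerces both sides to the same type before comparing: true
  similar: "1" ~= 1,
  // ! binds tightly: (!true) or true, which is true
  highPrecedence: !true or true,
  // not binds loosely: not (true or true), which is false
  lowPrecedence: not true or true
}
```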

For example, you might want to check if a number is greater than 100 so that you can inform a system whether or not to buy something. To do this, we will use the if/else expression, which is formatted like this:

if (<criteria_expression>) <return_if_true> else <return_if_false>

There are cases in DW where parentheses are optional, but it’s important to note the criteria must be surrounded by parentheses in if/else expressions. Here’s a concrete example:
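A minimal sketch of such a check, assuming a hypothetical price variable and the "buy" and "hold" outcomes this tutorial refers to:

```dataweave
%dw 2.0
output json
// Hypothetical value, chosen only for illustration
var price = 120
// The criteria must be wrapped in parentheses
var action = if (price > 100) "hold" else "buy"
---
action
```

With price set to 120, the criteria is true, so action evaluates to "hold". Changing price to 100 or less would make it evaluate to "buy".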

 
 
 

If you’re familiar with popular languages like Java or C#, you’ll notice that the way DataWeave implements if/else is much closer to a ternary expression than to the if/else statements you see in those languages. The difference is simple: DW uses if/else expressions, which return values, while Java and C# use if/else statements, which do not.

if/else expressions are chainable, meaning you can do multiple checks before you return any data. Here’s the syntax for how that works:

if (<criteria_expression1>) 
  <return_if_true> 
else if (<criteria_expression2>)
  <return_if_true> 
else
  <return_if_no_other_match>

You can chain as many if/else expressions as necessary. Imagine you had a third option, "sell", in addition to "buy" and "hold". You could chain if/else expressions together to account for the additional criteria:
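One way this could look, as a sketch with a hypothetical price variable and made-up thresholds of 100 and 200:

```dataweave
%dw 2.0
output json
// Hypothetical value and thresholds, chosen only for illustration
var price = 250
var action =
  if (price <= 100)
    "buy"
  else if (price > 200)
    "sell"
  else
    "hold"
---
action
```

The checks run top to bottom and the first one that is true decides the result, so a price of 250 evaluates to "sell" and a price of 150 falls through to "hold".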

 
 
 
Flow Control: Intro to Pattern Matching

Pattern matching is another method of flow control, but it does quite a bit more under the hood than the if/else expression does, and the syntax is a little more complicated. Like the if/else expression, pattern matching also returns a single value. Here’s a simplification of how pattern matching expressions are formatted:

<input_expression> match {
  case <condition> -> <expression>
  case <condition> -> <expression>
  else -> <expression>
}

The easiest way to understand basic pattern matching is to show an example:
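A minimal sketch that matches on literal values, assuming a hypothetical action variable and made-up result strings:

```dataweave
%dw 2.0
output json
// Hypothetical input value, chosen only for illustration
var action = "buy"
---
action match {
  // Each case compares the input against a literal value
  case "buy" -> "Buy at market price"
  case "sell" -> "Sell at market price"
  // else covers any value that no case matched
  else -> "Hold for now"
}
```

Like an if/else chain, the whole match expression returns a single value: the expression to the right of the first case that matches.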

 
 
 
DataWeave Functions

This tutorial will review how functions are used in DataWeave. It will cover how to create named functions and call them. It will also cover some of the tools DataWeave provides to make working with functions more convenient: lambdas, infix notation, and the dollar-sign syntax.

Functions are one of DataWeave’s most important tools. They allow us to conveniently reuse functionality and create functionality on the fly when reuse isn’t necessary. We create functions in the declarations section of the script using the fun keyword. This associates a set of functionality with a name. Here’s the basic syntax:

fun <function_name>([<arg1>], [<arg2>], …, [<argN>]) = <body>

It's good form to put the body on a new line and indent, like this:

fun sayHello(name) =
  "Hello " ++ name

Functions by themselves are values just like Strings and Numbers. But as you may have guessed, you can call functions to get them to execute their logic and return a value. You do so with the following syntax:

<function_name>(<arg1>, <arg2>, …, <argN>)

Here’s a simple example of creating a function and calling it:
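A minimal sketch, assuming a hypothetical function named add:

```dataweave
%dw 2.0
output json
// Declared in the declarations section, above the --- separator
fun add(n, m) =
  n + m
---
// Call the function by name and pass the arguments in parentheses
add(1, 2)
```

The script evaluates to 3.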

 
 
 

Notice that there is no return keyword. A return keyword isn't needed because almost everything in DataWeave is an expression, and all expressions return data.

DataWeave provides multiple ways to create functions. Just like we have named functions, we have functions without names, called lambdas. A lambda is a value in DataWeave, just like a String, an Object, and a Boolean. The syntax for a lambda is like so:

([<arg1>], [<arg2>], …, [<argN>]) -> <body>

Here’s how we might try to use a lambda in a DataWeave script:
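Based on the explanation that follows, the script in question passes a bare lambda straight to the JSON writer. A minimal reconstruction:

```dataweave
%dw 2.0
output json
---
// This script is expected to fail: the lambda is a value and is never called,
// so the writer receives a function rather than the number 5
() -> 2 + 3
```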

 
 
 

You may be wondering why the output was not 5. Why didn’t the function execute and return 2 + 3? Remember that functions represent functionality, and so do lambdas. They are just values, and they don’t do anything unless you call them. The lambda never executes; it describes functionality rather than performing it. Because of this, 5 is not passed to the writer. Instead, DataWeave sends () -> 2 + 3 as a value (just like a String). The writer then tries to serialize it as JSON, but JSON does not support functions, hence the error.

How would we get the script to return 5? Recall that in order to call functions you need to use the following syntax:

<function_name>(<arg1>, <arg2>, …, <argN>)

But lambdas don’t have names; that’s the whole point! To force a lambda to execute, we surround it in parentheses and append () to the end:
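Applied to the lambda () -> 2 + 3 discussed above, a minimal sketch looks like this:

```dataweave
%dw 2.0
output json
---
// The outer parentheses wrap the lambda; the trailing () calls it
(() -> 2 + 3)()
```

This time the lambda actually runs, and the script evaluates to 5.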

 
 
 

The pair of parentheses at the end of the lambda works the same as when you call a named function. In other words, those parentheses are where you pass arguments to the function if you have any:
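A sketch of the same idea with arguments, assuming hypothetical parameter names n and m:

```dataweave
%dw 2.0
output json
---
// 2 is bound to n and 3 is bound to m when the lambda is called
((n, m) -> n + m)(2, 3)
```

The script evaluates to 5.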

 
 
 

If this syntax seems awkward, it’s because we’re using lambdas in a way they're not meant to be used. If all you needed was the result of the script above, you’d be better off writing 2 + 3 directly as the body. So what are lambdas good for?

Because lambdas are values just like Strings, we can assign them to variables. This effectively gives a name to an unnamed function and allows us to use a lambda just like a normal function:
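A minimal sketch, assuming a hypothetical variable named add:

```dataweave
%dw 2.0
output json
// The lambda is stored in a variable, which gives it a name
var add = (n, m) -> n + m
---
// The variable can now be called like a named function
add(2, 3)
```

The script evaluates to 5.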

 
 
 

But that's not very useful either; we already have a nice syntax for the same thing with the fun keyword. The usefulness of lambdas becomes apparent when we combine two ideas:

  1. Lambdas are values just like Strings, Objects, and Booleans
  2. Values can be passed to functions as arguments, as well as returned from functions.

In other words, lambdas become useful when you want to pass functions as arguments to other functions, or return a function from another function. A function that does this is referred to as a higher-order function (HOF). HOFs are easier to understand once you’re familiar with how to use one. Here’s an example of using a HOF, filter, to make sure an Array only contains odd numbers:
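A sketch of such a call, assuming a hypothetical helper function named isOdd and a sample Array of the numbers 1 through 5:

```dataweave
%dw 2.0
output json
// Hypothetical named helper: true when n is odd
fun isOdd(n) =
  (n mod 2) == 1
---
// filter passes each item and its index to the second argument
filter([1, 2, 3, 4, 5], (n, idx) -> isOdd(n))
```

The script evaluates to [1, 3, 5].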

 
 
 

filter takes two arguments, an Array and a Lambda. In situations like these, it’s important to specify what the lambda should do as well. In the case of filter, the lambda should take in two arguments: an item in the Array, and the index of that particular item. It should return a Boolean. This Boolean value is used to determine if a value makes it to the returned Array. It is the responsibility of the receiving function to pass arguments into the lambda you specified, and do something with the return value. Let’s dig into how filter works to gain a deeper understanding of HOFs.

filter uses the lambda on each item of the Array to determine whether it should be in the returned Array. If filter calls the lambda with 1, it returns true, so 1 makes it to the output Array. If filter calls the lambda with 2, it returns false, so 2 does not make it to the output Array. This goes on until the last item of the Array is reached, then the final Array is returned.

While the code above is OK, it’s a little inconvenient. We had to give a name to the function in order to use it, even though we were never going to use it again. This is exactly where lambdas come in. They enable us to pass functions to other functions without the inconvenience of having to think up a name for them:
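A sketch of the same odd-number filter with the logic written inline, using a sample Array of the numbers 1 through 5:

```dataweave
%dw 2.0
output json
---
// The odd-number check lives directly in the lambda, so no name is needed
filter([1, 2, 3, 4, 5], (n, idx) -> (n mod 2) == 1)
```

The result is still [1, 3, 5].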

 
 
 

So far, we’ve been calling filter using prefix notation. You’re likely familiar with prefix notation from languages like Java, JavaScript, and Python. With prefix notation, the function name comes before the arguments. DataWeave supports another notation: infix notation. In DataWeave, if a function takes two arguments, like filter, you can call it with infix notation. Infix notation has the following syntax:

<arg1> <function_name> <arg2>

This syntax is preferred for nearly every function that takes a lambda as its second argument. Here’s how the code above would look if we called filter using infix notation:
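A sketch of the odd-number filter in infix notation, using a sample Array of the numbers 1 through 5:

```dataweave
%dw 2.0
output json
---
// <arg1> <function_name> <arg2>: the Array, then filter, then the lambda
[1, 2, 3, 4, 5] filter (n, idx) -> (n mod 2) == 1
```

The script evaluates to [1, 3, 5], the same result as the prefix call filter([1, 2, 3, 4, 5], ...).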

 
 
 

This may not seem like a great advantage, but it allows you to easily chain together functions that take in and return the same data type. For example, we can string together two filter functions in a way that is easy to read and understand:
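A sketch of two chained filter calls, using a sample Array of the numbers 1 through 5, with each lambda wrapped in its own parentheses:

```dataweave
%dw 2.0
output json
---
[1, 2, 3, 4, 5]
  // First pass keeps the odd numbers: [1, 3, 5]
  filter ((n, idx) -> (n mod 2) == 1)
  // Second pass keeps the numbers greater than 3: [5]
  filter ((n, idx) -> n > 3)
```

The script evaluates to [5].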

 
 
 

In this case, the Array is first filtered on whether each number is odd, then filtered on whether each remaining number is greater than 3. Notice the additional parentheses around the first lambda. The parentheses around the lambdas help DataWeave determine that the second call to filter is not part of the first filter’s lambda expression.

HOFs are so common in DataWeave’s library that there are additional syntax features that make them easier to use. For functions that DataWeave provides, you can represent the first, second, and third arguments passed to the lambda as $, $$, and $$$, respectively. We’ll refer to this as the dollar-sign syntax. When you use the dollar-sign syntax, you do not need to specify the arguments of the lambda when you pass it to the function. Here’s our odd number filter example from earlier using the dollar-sign syntax:
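A sketch of that filter, using a sample Array of the numbers 1 through 5:

```dataweave
%dw 2.0
output json
---
// $ stands for the first lambda argument, which for filter is the current item
[1, 2, 3, 4, 5] filter (($ mod 2) == 1)
```

The script evaluates to [1, 3, 5]. For filter, $$ would refer to the index of the current item.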

 
 
 

The dollar-sign syntax gives us all the same functionality as when we reference something by its name. This means we can chain selectors and indexes right on the dollar-sign in order to query data:
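A sketch of a selector chained onto the dollar sign, assuming a hypothetical Array of Objects named mules with name and age fields:

```dataweave
%dw 2.0
output json
// Hypothetical data, for illustration only
var mules = [
  {name: "Max", age: 7},
  {name: "Molly", age: 3}
]
---
// $.age selects the age field of the current item
mules filter ($.age > 5)
```

Only the Object for Max has an age greater than 5, so it is the only item in the returned Array.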

 
 
 
Conclusion

Thanks for reading part 2 of our three-part series on What is DataWeave. To learn more about what’s possible in DataWeave, continue with part 3 of the series.