What I learned about WDL
What is WDL?
The Workflow Description Language (WDL) is an open-source language designed to simplify complex computational workflows, particularly in genomics and bioinformatics. Developed by the Broad Institute and maintained by the OpenWDL community, WDL allows scientists to define analysis pipelines in a human-readable format.
Why It Matters:
- Standardizes workflow definitions across platforms
- Enables reproducibility in scientific research
- Simplifies scaling from laptops to cloud environments
WDL Syntax: A Step-by-Step Breakdown
Let’s dissect a simple WDL workflow to understand its structure.
Example Workflow
version 1.2 # Modern WDL version
task say_hello {
  input {
    String greeting
    String name
  }
  command <<<
    echo "~{greeting}, ~{name}!"
  >>>
  output {
    String message = read_string(stdout())
  }
  requirements {
    container: "ubuntu:latest"
  }
}
workflow main {
  input {
    String name
    Boolean is_pirate = false
  }
  Array[String] greetings = select_all([
    "Hello",
    "Hallo",
    "Hej",
    (
      if is_pirate
      then "Ahoy"
      else None
    ),
  ])
  scatter (greeting in greetings) {
    call say_hello {
      input:
        greeting = greeting,
        name = name
    }
  }
  output {
    Array[String] messages = say_hello.message
  }
}
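Line-by-Line Breakdown
1. Version Declaration
version 1.2
- New in 1.2: Adds features such as the requirements section used in this example, which replaces the older runtime section. (select_all() itself is available in earlier versions too.)
- Best Practice: Always declare versions explicitly for compatibility.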
2. Task Definition
task say_hello {
  input {
    String greeting
    String name
  }
- Input Parameters: Declares two required inputs for personalization.
- Flexibility: Tasks can be reused across workflows with different inputs.
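To illustrate the reuse point, here is a minimal sketch that calls the same task twice with different inputs. The workflow name two_greetings and the call aliases english and german are made up for this example; the task is the one from the post.

```wdl
version 1.2

task say_hello {
  input {
    String greeting
    String name
  }

  command <<<
    echo "~{greeting}, ~{name}!"
  >>>

  output {
    String message = read_string(stdout())
  }

  requirements {
    container: "ubuntu:latest"
  }
}

# Hypothetical workflow: reuses say_hello with two different greetings.
workflow two_greetings {
  input {
    String name
  }

  # "as" gives each call its own name so the two outputs stay distinguishable.
  call say_hello as english {
    input:
      greeting = "Hello",
      name = name
  }

  call say_hello as german {
    input:
      greeting = "Hallo",
      name = name
  }

  output {
    String english_message = english.message
    String german_message = german.message
  }
}
```

The task body is unchanged in both calls; only the inputs passed at the call site differ.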
3. Command Section (Heredoc Syntax)
command <<<
  echo "~{greeting}, ~{name}!"
>>>
- <<< >>> Syntax: Allows multi-line shell commands without escaping quotes or shell constructs such as ${VAR}, because only ~{} is treated as a WDL placeholder.
- Variable Substitution: ~{} injects WDL variables into shell commands.
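The difference between WDL placeholders and shell variables is easier to see when both appear in one command. The task below is a hypothetical sketch (shout_name is not part of the original workflow) and assumes the container provides the usual tr utility, as ubuntu:latest does.

```wdl
version 1.2

# Hypothetical task, for illustration only.
task shout_name {
  input {
    String name
  }

  # ~{name} is filled in by the WDL engine before the script runs.
  # $(...) and ${upper} are left untouched and evaluated by the shell.
  command <<<
    upper=$(echo "~{name}" | tr '[:lower:]' '[:upper:]')
    echo "HELLO, ${upper}!"
  >>>

  output {
    String message = read_string(stdout())
  }

  requirements {
    container: "ubuntu:latest"
  }
}
```

With name set to "Dave", the engine first renders the script with Dave in place of ~{name}; the shell then upper-cases it through its own variable.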
4. Output Declaration
output {
  String message = read_string(stdout())
}
- read_string(): Built-in function captures command output.
- Type Safety: Explicit String type ensures data consistency.
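The same pattern works for other output types. As a sketch, the hypothetical task below (count_lines and its text_file input are assumptions for this example) uses the standard library function read_int() so that the output is typed as Int rather than String.

```wdl
version 1.2

# Hypothetical task: counts the lines of a text file.
task count_lines {
  input {
    File text_file
  }

  command <<<
    wc -l < "~{text_file}"
  >>>

  output {
    # read_int() parses the content of stdout as an integer;
    # the task fails if the content is not a valid integer.
    Int line_count = read_int(stdout())
  }

  requirements {
    container: "ubuntu:latest"
  }
}
```

Choosing the matching read function means a malformed result is caught at the task boundary instead of propagating downstream as a string.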
5. Runtime Requirements
requirements {
  container: "ubuntu:latest"
}
- Reproducibility: Uses Docker containers for consistent environments.
- Additional Constraints: CPU and memory requirements can be specified in the same section, alongside the container.
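As a sketch of resource constraints, the task below adds cpu and memory to the requirements section. The task name say_hello_sized and the specific values (2 CPUs, 4 GiB) are arbitrary choices for illustration, not recommendations.

```wdl
version 1.2

# Hypothetical variant of say_hello with explicit resource requirements.
task say_hello_sized {
  input {
    String name
  }

  command <<<
    echo "Hello, ~{name}!"
  >>>

  output {
    String message = read_string(stdout())
  }

  requirements {
    container: "ubuntu:latest"
    cpu: 2             # minimum number of CPU cores
    memory: "4 GiB"    # minimum amount of memory
  }
}
```

How these values are enforced depends on the execution engine and backend in use.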
6. Workflow Inputs
input {
  String name
  Boolean is_pirate = false
}
- Default Values: is_pirate is optional (defaults to false).
- Runtime Flexibility: Users can override defaults when executing.
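For comparison, here is a small sketch of the three kinds of input side by side. The workflow name input_kinds and the title input are hypothetical additions for this example; select_first() is a standard library function that returns the first non-None element.

```wdl
version 1.2

# Hypothetical workflow showing required, defaulted, and optional inputs.
workflow input_kinds {
  input {
    String name                  # required: must be supplied by the user
    Boolean is_pirate = false    # has a default: may be omitted
    String? title                # optional type: None when omitted
  }

  # Falls back to "friend" when no title was provided.
  String salutation = select_first([title, "friend"])

  output {
    String summary = "~{salutation} ~{name}, pirate mode: ~{is_pirate}"
  }
}
```

An input with a default always has a value inside the workflow, while an optional input such as title has to be unwrapped before it can be used as a plain String.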
7. Array with Conditional Logic
Array[String] greetings = select_all([
  "Hello",
  "Hallo",
  "Hej",
  (if is_pirate then "Ahoy" else None),
])
- select_all(): Filters out None values, creating a clean array.
- Conditional Expression: Adds “Ahoy” only if is_pirate is true.
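Splitting the expression into steps makes the types involved visible. This is a minimal sketch; the workflow name optional_filtering and the intermediate variable names are assumptions for this example.

```wdl
version 1.2

# Hypothetical workflow that spells out each step of the filtering.
workflow optional_filtering {
  input {
    Boolean is_pirate = false
  }

  # One branch is None, so the expression has the optional type String?.
  String? pirate_greeting = if is_pirate then "Ahoy" else None

  # An array that may contain None entries.
  Array[String?] with_gaps = ["Hello", pirate_greeting, "Hej"]

  # select_all() drops the None entries and returns a plain Array[String].
  Array[String] greetings = select_all(with_gaps)

  output {
    Array[String] result = greetings
    Int count = length(greetings)
  }
}
```

With the default is_pirate = false, the None entry is dropped and count is 2; with is_pirate = true, all three greetings are kept.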
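8. Parallel Execution
scatter (greeting in greetings) {
  call say_hello { input: greeting, name }
}
- Scatter/Gather: Runs say_hello once per greeting. The calls are independent of each other, so the engine can run them in parallel.
- Cloud Optimization: On distributed backends, the engine can spread these calls across machines without any change to the workflow.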
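The gather half of scatter/gather also applies to plain declarations, not only to task calls. In this sketch (the workflow name numbered_greetings and its names input are hypothetical), a value declared inside the scatter body becomes an array outside of it, in the same order as the input.

```wdl
version 1.2

# Hypothetical workflow: builds one numbered line per name.
workflow numbered_greetings {
  input {
    Array[String] names
  }

  # range(n) yields [0, 1, ..., n-1], so i indexes into names.
  scatter (i in range(length(names))) {
    String line = "~{i + 1}. Hello, ~{names[i]}!"
  }

  output {
    # Outside the scatter, line is an Array[String] with one entry per iteration.
    Array[String] lines = line
  }
}
```

For names = ["Ann", "Bob"], lines is ["1. Hello, Ann!", "2. Hello, Bob!"].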
9. Workflow Output
output {
  Array[String] messages = say_hello.message
}
- Aggregation: Collects outputs from all parallel tasks.
- Downstream Use: These messages could feed into another workflow.
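As one possible downstream use, the sketch below takes an array of messages and joins them into a single line. The task join_messages and the workflow summarize are hypothetical; sep() is a standard library function available from WDL 1.1 onward. The example assumes the messages contain no double quotes, since they are placed inside a quoted shell string.

```wdl
version 1.2

# Hypothetical task: joins all messages into one line.
task join_messages {
  input {
    Array[String] messages
  }

  command <<<
    echo "~{sep(' | ', messages)}"
  >>>

  output {
    String joined = read_string(stdout())
  }

  requirements {
    container: "ubuntu:latest"
  }
}

# Hypothetical workflow that consumes messages produced elsewhere.
workflow summarize {
  input {
    Array[String] messages
  }

  call join_messages {
    input:
      messages = messages
  }

  output {
    String summary = join_messages.joined
  }
}
```

The Array[String] type on the input matches the type of main.messages, so the output of the first workflow can be passed in unchanged.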
Key WDL Features Demonstrated
- Conditional Arrays:
(if is_pirate then "Ahoy" else None)
- Enables dynamic workflow configurations based on inputs.
- Scatter Parallelization:
scatter (greeting in greetings) { ... }
- Simplifies parallel processing of large datasets.
- Type-Safe Outputs:
Array[String] messages = say_hello.message
- Ensures data integrity between workflow steps.
Running the Workflow
Input JSON:
{
  "main.name": "Dave",
  "main.is_pirate": true
}
Expected Output:
{
  "main.messages": ["Hello, Dave!", "Hallo, Dave!", "Hej, Dave!", "Ahoy, Dave!"]
}
Getting Started with WDL
- Install a WDL Runner:
# For MiniWDL (used in our project):
pip install miniwdl
- Write Your First Workflow:
version 1.0
workflow hello_wdl {
  call say_hello
  output {
    String message = "Workflow completed!"
  }
}
task say_hello {
  command { echo "Hello, WDL!" }
}
- Run It:
miniwdl run hello.wdl
Conclusion
Learning WDL gave me the foundation I need to build tools around it, with the aim of making workflow development even more intuitive.
Resources:
- OpenWDL Documentation: a very good resource for understanding WDL.
- WDL Quickstart Guide