f64ad


Introduction

This crate brings easy-to-use, efficient, and highly flexible automatic differentiation to the Rust programming language. Utilizing Rust's extensive operator overloading and expressive enum features, f64ad can be thought of as a drop-in replacement for f64 that affords forward-mode or reverse-mode automatic differentiation on any downstream computation in Rust.

Key features

  • f64ad supports both reverse-mode and forward-mode automatic differentiation.
  • f64ad supports not just first derivatives, but also higher-order derivatives of any order on any function.
  • f64ad uses polymorphism such that any f64ad object can be treated either as a derivative-tracking variable or as a standard f64 with very little overhead, depending on your current use case. Thus, it is reasonable to replace almost all uses of f64 with f64ad, and in return you'll be able
    to "turn on" derivatives with respect to these values whenever you need them.
  • The f64ad enum type implements several useful traits that allow it to operate almost exactly like a standard f64. For example, it even implements the RealField and ComplexField traits, meaning it can be used in any nalgebra or ndarray computation (see the sketch after this list).
  • Certain functions can be pre-computed and locked to boost performance at run-time.
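
To illustrate the trait bullet above, here is a minimal sketch. The helper `angle_between` is hypothetical, and the f64ad path assumes only the RealField implementation described in the feature list; the function itself depends only on nalgebra.

use nalgebra::{RealField, Vector3};

// Generic over any RealField scalar: plain f64 works out of the box, and an
// f64ad scalar should work too, assuming the trait impls described above.
fn angle_between<T: RealField>(a: &Vector3<T>, b: &Vector3<T>) -> T {
    let cos_theta = a.dot(b) / (a.norm() * b.norm());
    cos_theta.acos()
}

fn main() {
    // Evaluate with standard f64 scalars...
    let a = Vector3::new(1.0_f64, 0.0, 0.0);
    let b = Vector3::new(0.0, 1.0, 0.0);
    println!("{}", angle_between(&a, &b)); // prints pi/2
    // ...or spawn f64ad variables from a computation graph and pass them
    // through the exact same function to obtain a derivative-tracking result.
}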

Crate structure

This crate is a cargo workspace with two member crates: (1) f64ad_core and (2) f64ad_core_derive. All core implementations for f64ad can be found in f64ad_core. The f64ad_core_derive crate is currently a placeholder and will be used for procedural macro implementations.

Getting Started

In this chapter, I will cover how to set up f64ad.

f64ad Installation

The f64ad code lives on GitHub. A standard way to install it is to ensure that git is installed on your system, then clone the repository by running the following command in your terminal after navigating to a desired directory:

git clone https://github.com/djrakita/f64ad.git

And that's it! The f64ad files are now on your computer. In the following section, I will outline how to install the Rust toolchain in order to finish setting up f64ad.

Rust Installation

We will next ensure that the Rust programming language toolchain is installed on your computer. Rust is required in order to set up and compile all of the features offered by f64ad, even if you are mostly interested in using it from Python, WebAssembly, etc.

Fortunately, the Rust toolchain is very easy to install. Simply follow the one-step installation guide at https://www.rust-lang.org/tools/install. To verify that Rust has been successfully installed, run the following command:

cargo --version

Also, to verify that the newly installed Rust compiler works with f64ad, navigate to the f64ad directory, then run:

cargo test

Numerous tests should run, all passing with a result of ok.

Citing f64ad

If you use any part of the f64ad library in your research, please cite the software as follows:

 @misc{rakita_2022,
  author={Rakita, Daniel},
  title={f64ad: Efficient and Flexible Automatic Differentiation in Rust},
  url={https://djrakita.github.io/f64ad/},
  year={2022}
 }

Examples

In this chapter, I will go through some examples that use f64ad.

Example 1: Univariate Autodiff

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn an f64ad variable with a value of 2.
    let v = computation_graph.spawn_variable(2.0);

    // You can now use an f64ad variable exactly the same as you would use a standard f64.  In this
    // example, we are just using the `powi` function to raise v to the third power.
    let result = v.powi(3);
    println!("Result of v.powi(3): {:?}", result);

    // We can now find the derivative of our just computed function with respect to our input variable,
    // `v`.

    // We can do this in one of two ways.  First, we can use backwards mode autodiff, meaning we
    // call `backwards_mode_grad` on our output result wrt our input variable, `v`:
    let backwards_mode_derivative = result.backwards_mode_grad(false).wrt(&v);

    // Alternatively, we can use forward mode autodiff, meaning we call `forward_mode_grad` on
    // our input variable `v` with respect to our output variable, `result`.
    let forward_mode_derivative = v.forward_mode_grad(false).wrt(&result);

    // Both methods will output the same derivative.
    println!("Backwards mode derivative: {:?}", backwards_mode_derivative);
    println!("Forward mode derivative: {:?}", forward_mode_derivative);
}

Output

Result of v.powi(3): f64ad_var_f(f64ad_var_f{ value: 8.0, node_idx: 1 })
Backwards mode derivative: f64(12.0)
Forward mode derivative: f64(12.0)
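
As a quick sanity check, both modes agree with the analytic derivative: d(v^3)/dv = 3v^2 = 12 at v = 2. Note also that the raw result prints as a graph-tracked variable (node_idx: 1), while the extracted derivatives are plain f64 values because `add_to_computation_graph` was set to `false`.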

Example 2: Backwards Mode Multivariate Autodiff

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn f64ad variables from the computation graph.
    let v0 = computation_graph.spawn_variable(2.0);
    let v1 = computation_graph.spawn_variable(4.0);
    let v2 = computation_graph.spawn_variable(6.0);
    let v3 = computation_graph.spawn_variable(8.0);

    // compute some result using our variables
    let result = v0.sin() * v1 + 5.0 * v2.log(v3);
    println!("Result: {:?}", result);

    // compute derivatives in backwards direction from result.  Using backwards mode automatic
    // differentiation makes sense in this case because our number of outputs (1) is less than
    // our number of input variables (4).
    let derivatives = result.backwards_mode_grad(false);

    // access derivatives for each input variable from our `derivatives` object.
    let d_result_d_v0 = derivatives.wrt(&v0);
    let d_result_d_v1 = derivatives.wrt(&v1);
    let d_result_d_v2 = derivatives.wrt(&v2);
    let d_result_d_v3 = derivatives.wrt(&v3);

    // print results
    println!("d_result_d_v0: {:?}", d_result_d_v0);
    println!("d_result_d_v1: {:?}", d_result_d_v1);
    println!("d_result_d_v2: {:?}", d_result_d_v2);
    println!("d_result_d_v3: {:?}", d_result_d_v3);
}

Output

Result: f64ad_var_f(f64ad_var_f{ value: 7.9454605418379876, node_idx: 8 })
d_result_d_v0: f64(-1.6645873461885696)
d_result_d_v1: f64(0.9092974268256817)
d_result_d_v2: f64(0.40074862246915655)
d_result_d_v3: f64(-0.25898004032460736)
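
These match the analytic partial derivatives of sin(v0)*v1 + 5*ln(v2)/ln(v3): for example, d/dv0 = cos(v0)*v1 = cos(2)*4 ≈ -1.6646 and d/dv1 = sin(v0) = sin(2) ≈ 0.9093.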

Example 3: Forward Mode Multivariate Autodiff

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn an f64ad variable with a value of 2.
    let v = computation_graph.spawn_variable(2.0);

    // compute some results using our variable
    let result0 = v.sin();
    let result1 = v.cos();
    let result2 = v.tan();
    println!("Result0: {:?}", result0);
    println!("Result1: {:?}", result1);
    println!("Result2: {:?}", result2);

    // compute derivatives in forward direction from v.  Using forward mode automatic
    // differentiation makes sense in this case because our number of outputs (3) is greater than
    // our number of input variables (1).
    let derivatives = v.forward_mode_grad(false);

    // access derivatives for each input variable from our `derivatives` object.
    let d_result0_d_v = derivatives.wrt(&result0);
    let d_result1_d_v = derivatives.wrt(&result1);
    let d_result2_d_v = derivatives.wrt(&result2);

    // print results
    println!("d_result0_d_v: {:?}", d_result0_d_v);
    println!("d_result1_d_v: {:?}", d_result1_d_v);
    println!("d_result2_d_v: {:?}", d_result2_d_v);

Output

Result0: f64ad_var_f(f64ad_var_f{ value: 0.9092974268256817, node_idx: 1 })
Result1: f64ad_var_f(f64ad_var_f{ value: -0.4161468365471424, node_idx: 2 })
Result2: f64ad_var_f(f64ad_var_f{ value: -2.185039863261519, node_idx: 3 })
d_result0_d_v: f64(-0.4161468365471424)
d_result1_d_v: f64(-0.9092974268256817)
d_result2_d_v: f64(5.774399204041917)
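
These agree with the analytic derivatives at v = 2: d(sin v)/dv = cos(2) ≈ -0.4161, d(cos v)/dv = -sin(2) ≈ -0.9093, and d(tan v)/dv = 1/cos^2(2) ≈ 5.7744.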

Example 4: Polymorphism

use f64ad_core::f64ad::{GlobalComputationGraphs, f64ad};

// f64ad is an enum that serves as a drop-in replacement for f64.  It can track derivative
// information for both, either, or neither of its arguments; you can select what you want
// depending on your application at the time.
fn f64ad_test(a: f64ad, b: f64ad) -> f64ad {
    return a + b;
}

fn main() {
    let computation_graph = GlobalComputationGraphs::get(None, None);
    let a = computation_graph.spawn_variable(1.0);
    let b = computation_graph.spawn_variable(2.0);

    // Compute result using two f64ad variables that track derivative information for both `a` and `b`.
    let result1 = f64ad_test(a, b);
    println!("result 1: {:?}", result1);

    ////////////////////////////////////////////////////////////////////////////////////////////////

    computation_graph.reset();
    let a = computation_graph.spawn_variable(1.0);

    // Compute result using one f64ad variable that only tracks derivative information for `a`.
    let result2 = f64ad_test(a, f64ad::f64(2.0));
    println!("result 2: {:?}", result2);

    ////////////////////////////////////////////////////////////////////////////////////////////////

    // Compute result using zero f64ad variables.  This operation will not keep track of derivative
    // information for any variable and will essentially run on plain f64 floats with almost no overhead.
    let result3 = f64ad_test(f64ad::f64(1.0), f64ad::f64(2.0));
    println!("result 3: {:?}", result3);
}

Output

result 1: f64ad_var_f(f64ad_var_f{ value: 3.0, node_idx: 2 })
result 2: f64ad_var_f(f64ad_var_f{ value: 3.0, node_idx: 1 })
result 3: f64(3.0)
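
Note the Debug output: results 1 and 2 are graph-tracked variables with node indices, while result 3 is a plain f64 variant, confirming that the final call recorded no derivative information at all.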

Example 5: Univariate Higher Order Derivatives

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn an f64ad variable from the computation graph.
    let v = computation_graph.spawn_variable(2.0);

    let result = v.powi(5);
    println!("Result: {:?}", result);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // first derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true` so that this derivative
    // can itself be differentiated below
    let derivatives = result.backwards_mode_grad(true);
    let d_result_d_v = derivatives.wrt(&v);
    println!("d_result_d_v: {:?}", d_result_d_v);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // second derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true`
    let derivatives = d_result_d_v.backwards_mode_grad(true);
    let d2_result_d_v2 = derivatives.wrt(&v);
    println!("d2_result_d_v2: {:?}", d2_result_d_v2);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // third derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true`
    let derivatives = d2_result_d_v2.backwards_mode_grad(true);
    let d3_result_d_v3 = derivatives.wrt(&v);
    println!("d3_result_d_v3: {:?}", d3_result_d_v3);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // fourth derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true`
    let derivatives = d3_result_d_v3.backwards_mode_grad(true);
    let d4_result_d_v4 = derivatives.wrt(&v);
    println!("d4_result_d_v4: {:?}", d4_result_d_v4);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // fifth derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true`
    let derivatives = d4_result_d_v4.backwards_mode_grad(true);
    let d5_result_d_v5 = derivatives.wrt(&v);
    println!("d5_result_d_v5: {:?}", d5_result_d_v5);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // sixth derivative computations...
    // we must set the parameter `add_to_computation_graph` to `true`
    let derivatives = d5_result_d_v5.backwards_mode_grad(true);
    let d6_result_d_v6 = derivatives.wrt(&v);
    println!("d6_result_d_v6: {:?}", d6_result_d_v6);
    println!("////////////////////////////////////////////////////////////////////////////////////");
}

Output

Result: f64ad_var_f(f64ad_var_f{ value: 32.0, node_idx: 1 })
////////////////////////////////////////////////////////////////////////////////////
d_result_d_v: f64ad_var_f(f64ad_var_f{ value: 80.0, node_idx: 5 })
////////////////////////////////////////////////////////////////////////////////////
d2_result_d_v2: f64ad_var_f(f64ad_var_f{ value: 160.0, node_idx: 13 })
////////////////////////////////////////////////////////////////////////////////////
d3_result_d_v3: f64ad_var_f(f64ad_var_f{ value: 240.0, node_idx: 29 })
////////////////////////////////////////////////////////////////////////////////////
d4_result_d_v4: f64ad_var_f(f64ad_var_f{ value: 240.0, node_idx: 61 })
////////////////////////////////////////////////////////////////////////////////////
d5_result_d_v5: f64ad_var_f(f64ad_var_f{ value: 120.0, node_idx: 125 })
////////////////////////////////////////////////////////////////////////////////////
d6_result_d_v6: f64ad_var_f(f64ad_var_f{ value: 0.0, node_idx: 253 })
////////////////////////////////////////////////////////////////////////////////////
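
These values follow the analytic pattern d^k(v^5)/dv^k = (5!/(5-k)!) * v^(5-k) at v = 2, i.e. 80, 160, 240, 240, 120, and finally 0 for the sixth derivative. Note also the growing node_idx values: because `add_to_computation_graph` is `true` on every pass, each derivative computation appends new nodes to the computation graph.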

Example 6: Multivariate Higher Order Derivatives

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn f64ad variables from the computation graph.
    let v0 = computation_graph.spawn_variable(2.0);
    let v1 = computation_graph.spawn_variable(4.0);

    let result = v0.powf(v1);
    println!("Result: {:?}", result);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // first derivative computations...
    let derivatives = result.backwards_mode_grad(true);

    let d_result_d_v0 = derivatives.wrt(&v0);
    let d_result_d_v1 = derivatives.wrt(&v1);
    println!("d_result_d_v0: {:?}", d_result_d_v0);
    println!("d_result_d_v0: {:?}", d_result_d_v1);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // second derivative computations...
    let derivatives2_d_result_d_v0 = d_result_d_v0.backwards_mode_grad(true);
    let derivatives2_d_result_d_v1 = d_result_d_v1.backwards_mode_grad(true);

    let d2_result_dv0v0 = derivatives2_d_result_d_v0.wrt(&v0);
    let d2_result_dv0v1 = derivatives2_d_result_d_v0.wrt(&v1);
    let d2_result_dv1v0 = derivatives2_d_result_d_v1.wrt(&v0);
    let d2_result_dv1v1 = derivatives2_d_result_d_v1.wrt(&v1);
    println!("d2_result_dv0v0: {:?}", d2_result_dv0v0);
    println!("d2_result_dv0v1: {:?}", d2_result_dv0v1);
    println!("d2_result_dv1v0: {:?}", d2_result_dv1v0);
    println!("d2_result_dv1v1: {:?}", d2_result_dv1v1);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // etc...
}

Output

Result: f64ad_var_f(f64ad_var_f{ value: 16.0, node_idx: 2 })
////////////////////////////////////////////////////////////////////////////////////
d_result_d_v0: f64ad_var_f(f64ad_var_f{ value: 32.0, node_idx: 10 })
d_result_d_v1: f64ad_var_f(f64ad_var_f{ value: 11.090354888959125, node_idx: 12 })
////////////////////////////////////////////////////////////////////////////////////
d2_result_dv0v0: f64ad_var_f(f64ad_var_f{ value: 48.0, node_idx: 54 })
d2_result_dv0v1: f64ad_var_f(f64ad_var_f{ value: 30.18070977791825, node_idx: 56 })
d2_result_dv1v0: f64ad_var_f(f64ad_var_f{ value: 30.18070977791825, node_idx: 238 })
d2_result_dv1v1: f64ad_var_f(f64ad_var_f{ value: 7.687248222691222, node_idx: 240 })
////////////////////////////////////////////////////////////////////////////////////
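
As a check, for f = v0^v1 at (v0, v1) = (2, 4): df/dv0 = v1*v0^(v1-1) = 32, df/dv1 = v0^v1*ln(v0) ≈ 11.0904, and the mixed second partials d2_result_dv0v1 and d2_result_dv1v0 agree (≈ 30.1807), as expected from the symmetry of second derivatives.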

Example 7: Jacobian

use f64ad_core::ComplexField;
use f64ad_core::f64ad::{f64ad_jacobian, GlobalComputationGraphs};

fn main() {
    // Create a computation graph.
    let computation_graph = GlobalComputationGraphs::get(None, None);

    // Spawn f64ad variables from the computation graph.
    let v0 = computation_graph.spawn_variable(2.0);
    let v1 = computation_graph.spawn_variable(4.0);

    let result1 = v0.powf(v1);
    let result2 = v1.log(v0);
    println!("Result1: {:?}", result1);
    println!("Result2: {:?}", result2);
    println!("////////////////////////////////////////////////////////////////////////////////////");

    // Compute the first-order Jacobian and print a summary
    let first_order_jacobian = f64ad_jacobian(&[v0, v1], &[result1, result2], 1);
    first_order_jacobian.print_summary();

    println!("////////////////////////////////////////////////////////////////////////////////////");

    // Compute the fourth-order Jacobian and print a summary
    let higher_order_jacobian = f64ad_jacobian(&[v0, v1], &[result1, result2], 4);
    higher_order_jacobian.print_summary();
}

Output

Result1: f64ad_var_f(f64ad_var_f{ value: 16.0, node_idx: 2 })
Result2: f64ad_var_f(f64ad_var_f{ value: 2.0, node_idx: 3 })
////////////////////////////////////////////////////////////////////////////////////
0 --- output: 0, inputs wrt: [0], value: 32.0
1 --- output: 0, inputs wrt: [1], value: 11.090354888959125
2 --- output: 1, inputs wrt: [0], value: -1.4426950408889634
3 --- output: 1, inputs wrt: [1], value: 0.36067376022224085
////////////////////////////////////////////////////////////////////////////////////
0 --- output: 0, inputs wrt: [0, 0, 0, 0], value: 24.0
1 --- output: 0, inputs wrt: [0, 0, 0, 1], value: 85.27106466687738
2 --- output: 0, inputs wrt: [0, 0, 1, 0], value: 85.27106466687738
3 --- output: 0, inputs wrt: [0, 0, 1, 1], value: 69.8779867794306
4 --- output: 0, inputs wrt: [0, 1, 0, 0], value: 85.27106466687738
5 --- output: 0, inputs wrt: [0, 1, 0, 1], value: 69.8779867794306
6 --- output: 0, inputs wrt: [0, 1, 1, 0], value: 69.8779867794306
7 --- output: 0, inputs wrt: [0, 1, 1, 1], value: 22.187661197682576
8 --- output: 0, inputs wrt: [1, 0, 0, 0], value: 85.27106466687738
9 --- output: 0, inputs wrt: [1, 0, 0, 1], value: 69.8779867794306
10 --- output: 0, inputs wrt: [1, 0, 1, 0], value: 69.8779867794306
11 --- output: 0, inputs wrt: [1, 0, 1, 1], value: 22.187661197682576
12 --- output: 0, inputs wrt: [1, 1, 0, 0], value: 69.8779867794306
13 --- output: 0, inputs wrt: [1, 1, 0, 1], value: 22.187661197682576
14 --- output: 0, inputs wrt: [1, 1, 1, 0], value: 22.187661197682576
15 --- output: 0, inputs wrt: [1, 1, 1, 1], value: 3.693361577329335
16 --- output: 1, inputs wrt: [0, 0, 0, 0], value: 33.31458966591519
17 --- output: 1, inputs wrt: [0, 0, 0, 1], value: -1.5053751004845806
18 --- output: 1, inputs wrt: [0, 0, 1, 0], value: -1.5053751004845806
19 --- output: 1, inputs wrt: [0, 0, 1, 1], value: -0.12635828742686595
20 --- output: 1, inputs wrt: [0, 1, 0, 0], value: -1.5053751004845806
21 --- output: 1, inputs wrt: [0, 1, 0, 1], value: -0.12635828742686595
22 --- output: 1, inputs wrt: [0, 1, 1, 0], value: -0.12635828742686595
23 --- output: 1, inputs wrt: [0, 1, 1, 1], value: -0.03252139032821262
24 --- output: 1, inputs wrt: [1, 0, 0, 0], value: -1.505375100484581
25 --- output: 1, inputs wrt: [1, 0, 0, 1], value: -0.1263582874268659
26 --- output: 1, inputs wrt: [1, 0, 1, 0], value: -0.12635828742686595
27 --- output: 1, inputs wrt: [1, 0, 1, 1], value: -0.032521390328212586
28 --- output: 1, inputs wrt: [1, 1, 0, 0], value: -0.12635828742686595
29 --- output: 1, inputs wrt: [1, 1, 0, 1], value: -0.0325213903282126
30 --- output: 1, inputs wrt: [1, 1, 1, 0], value: -0.03252139032821263
31 --- output: 1, inputs wrt: [1, 1, 1, 1], value: -0.033813165020835076
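
In each summary row, `output` is the index of the output being differentiated and `inputs wrt` lists the sequence of input indices differentiated against. The first-order block is simply the flattened 2x2 Jacobian, and in the fourth-order block, rows whose `inputs wrt` entries are permutations of one another agree up to floating-point error, again reflecting the symmetry of higher-order partial derivatives.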