Installation

Add ParametricDFNOs.jl as a dependency to your environment.

To add, either do:

julia> ]
(@v1.9) pkg> add ParametricDFNOs

OR

julia> using Pkg
julia> Pkg.activate("path/to/your/environment")
julia> Pkg.add("ParametricDFNOs")

Jump right in

To get started, you can try running some of the examples.

Setup

Make sure to include the dependencies you plan on using in your environment:

using MPI
using CUDA

# If you plan on using the 2D Time varying FNO or 3D FNO.
using ParametricDFNOs.DFNO_2D

# If you plan on using the 3D Time varying FNO or 4D FNO.
using ParametricDFNOs.DFNO_3D

We also use PyPlot for plotting, so you will need to install matplotlib:

python3 -m pip install matplotlib

MPI setup

MPI Distribution

Make sure you have a functional MPI distribution set up.

All code must be wrapped in:

MPI.Init()

### Code here ###

MPI.Finalize()
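
Such a script can then be launched across multiple processes, for example with the mpiexecjl wrapper provided by MPI.jl (a sketch, assuming your code is saved as script.jl):

# run script.jl on 4 MPI ranks
mpiexecjl -n 4 julia --project script.jl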

Change to custom use case

We show usage for ParametricDFNOs.DFNO_2D, but extending to the other FNOs should be as simple as changing the number. Please refer to the API for the exact differences.

GPU usage

Default behavior

By default, the package will use the GPU based on whether the DFNO_2D_GPU flag was set at compile time of the package.

You can set the GPU flag by using:

export DFNO_2D_GPU=1

and then, in your Julia code:

global gpu_flag = parse(Bool, get(ENV, "DFNO_2D_GPU", "0"))
DFNO_2D.set_gpu_flag(gpu_flag)

Binding GPUs

If you wish to run on multiple GPUs, make sure the GPUs are bound to different tasks. The approach we choose in our examples is to unbind the GPUs on request and assign them manually:

using MPI, CUDA

rank = MPI.Comm_rank(MPI.COMM_WORLD)  # this process's MPI rank
CUDA.device!(rank % 4)                # bind this rank to one of the node's 4 GPUs

where the modulus might differ if you have more or fewer than 4 GPUs per node.
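
If the GPU count varies, here is a sketch that adapts to however many devices CUDA can see, assuming ranks are numbered consecutively within a node:

# bind this rank to one of the node's visible GPUs, whatever their count
CUDA.device!(rank % length(CUDA.devices()))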

Model Setup

Define a 2D Model configuration:

# `partition` is defined under Data Partitioning below
modelConfig = DFNO_2D.ModelConfig(nx=20, ny=20, nt=50, mx=4, my=4, mt=4, nblocks=4, partition=partition, dtype=Float32)

Define some random inputs to operate on:

input_size = (modelConfig.nc_in * modelConfig.nx * modelConfig.ny * modelConfig.nt) ÷ prod(partition)
output_size = input_size * modelConfig.nc_out ÷ modelConfig.nc_in

x = rand(modelConfig.dtype, input_size, 1)
y = rand(modelConfig.dtype, output_size, 1)
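
As a sanity check of these sizes (assuming nc_in = 4, i.e. one data channel plus three grid channels, and 4 workers):

# input_size = nc_in * nx * ny * nt ÷ prod(partition)
#            = 4 * 20 * 20 * 50 ÷ 4 = 20_000 entries per worker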

Initializing model

model = DFNO_2D.Model(modelConfig)
θ = DFNO_2D.initModel(model)

Forward and backward pass

See Simple 2D forward and gradient pass for a full example.

DFNO_2D.forward(model, θ, x)

Distributed Loss Function

We provide a distributed relative L2 loss, but most distributed loss functions should be straightforward to build with ParametricOperators.jl.

To compute the gradient:

using Zygote
using ParametricDFNOs.UTILS

gradient(params -> loss_helper(UTILS.dist_loss(DFNO_2D.forward(model, params, x), y)), θ)[1]
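
Here loss_helper is user code; a minimal sketch that just records the scalar loss for logging before handing it back to Zygote could be:

global loss = nothing

function loss_helper(l)
    global loss = l   # keep the latest loss around for printing
    return l
end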

Data Partitioning

We have a 2D data loading struct to store information about our data; see Training 2D Time varying FNO:

dataConfig = DFNO_2D.DataConfig(modelConfig=modelConfig,
                                x_key="perm",
                                x_file=perm_store_path_jld2,
                                y_key="conc",
                                y_file=conc_store_path_jld2)

Consider the following dimensions:

c, t, x, y, z where

x, y, z - Spatial Dimensions
t - Time Dimension
c - Channel Dimension

Data is considered to be combined along certain dimensions:

ct, xy for DFNO_2D

ctx, yz for DFNO_3D

The partition array is a two-element array that specifies across how many workers each combined dimension is split.

By default we do:

comm = MPI.COMM_WORLD
pe_count = MPI.Comm_size(comm)

# keep the combined ct dimension on one worker; split the combined xy dimension across all workers
partition = [1, pe_count]

The models are implemented to modify the operators according to the specified partition. We suggest you leave this as it is.

Running into assertion errors

If you run into any assertion errors because the number of workers does not divide the data evenly, please open a GitHub issue.

We provide a distributed read wrapper that allows you to read data seamlessly.

Simply implement:

# Returns a tensor of size (in_channels, size(indices)...)
function dist_read_x_tensor(file_name, key, indices)
    # your read logic here
end

# Returns a tensor of size (out_channels=1, size(indices)...)
function dist_read_y_tensor(file_name, key, indices)
    # your read logic here
end
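
As a minimal sketch for DFNO_2D, assuming each JLD2 file stores a plain array under key, with permeability shaped (nx, ny, nsamples) and concentration shaped (nx, ny, nt, nsamples):

using JLD2

function dist_read_x_tensor(file_name, key, indices)
    data = jldopen(file_name, "r") do file
        file[key][indices...]   # load the stored array and take this worker's slice
    end
    return reshape(data, 1, size(data)...)   # prepend the channel dimension
end

function dist_read_y_tensor(file_name, key, indices)
    data = jldopen(file_name, "r") do file
        file[key][indices...]
    end
    return reshape(data, 1, size(data)...)   # out_channels = 1
end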

In channels

The number of in_channels you specify at Model Setup is data_channels + 3 for DFNO_2D and data_channels + 4 for DFNO_3D. This is to account for the grid data we include for each of the dimensions in the FNO.
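
For example, with a single data channel (such as permeability):

# DFNO_2D: grid channels for x, y and t are appended to the data
nc_in = 1 + 3   # data_channels + 3

# DFNO_3D: grid channels for x, y, z and t
nc_in = 1 + 4   # data_channels + 4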

Out channels

Currently, the distributed wrapper only supports reading for the case where the out channel is 1. You can implement your own read function or wait for a version update.

DFNO_2D

Here, the indices for dist_read_x_tensor represent:

(x_start:x_end, y_start:y_end, sample_start:sample_end)

and the indices for dist_read_y_tensor represent:

(x_start:x_end, y_start:y_end, t_start:t_end, sample_start:sample_end)

DFNO_3D

Here, the indices for dist_read_x_tensor represent:

(x_start:x_end, y_start:y_end, z_start:z_end, sample_start:sample_end)

and the indices for dist_read_y_tensor represent:

(x_start:x_end, y_start:y_end, z_start:z_end, t_start:t_end, sample_start:sample_end)

Now you can use loadDistData from 2D Data Loading or 3D Data Loading.

This can also be extended to more complex storage regimes. Consider the following case:

samples/
├── sample1/
│   ├── inputs.jld2
│   └── outputs.jld2
└── sample2/
    ├── inputs.jld2
    └── outputs.jld2

We can handle this as in Custom 3D Time varying FNO; a sketch is shown below.
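
A hypothetical sketch of dist_read_x_tensor for this layout, assuming x_file points at the samples/ root and each inputs.jld2 stores one sample's array under key:

using JLD2

function dist_read_x_tensor(root_dir, key, indices)
    slices = map(indices[end]) do sample          # the last range indexes samples
        path = joinpath(root_dir, "sample$(sample)", "inputs.jld2")
        data = jldopen(path, "r") do file
            file[key][indices[1:end-1]...]        # spatial/time slice for this sample
        end
        reshape(data, 1, size(data)..., 1)        # add channel and sample dims
    end
    return cat(slices...; dims=ndims(slices[1]))  # concatenate along the sample dim
end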

Training wrapper

We also provide a training wrapper to train out of the box. See Training 2D Time varying FNO for a full example.

Define a 2D Training configuration:

trainConfig = DFNO_2D.TrainConfig(
    epochs=10,
    x_train=x_train,
    y_train=y_train,
    x_valid=x_valid,
    y_valid=y_valid,
    plot_every=1
)

And train using:

DFNO_2D.train!(trainConfig, model, θ)
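
As before, such a training script is launched under MPI, optionally with the GPU flag set (a sketch, assuming the script is saved as train.jl):

export DFNO_2D_GPU=1
mpiexecjl -n 4 julia --project train.jl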