Rust Triton Client

I’ve recently been trying to improve model deployment and to test different approaches. After trying TensorFlow Serving and TorchServe, I decided to take a look at NVIDIA Triton. Its high performance and support for multiple model backends are very appealing. However, I wanted to integrate it into my Rust backend stack, so I decided to implement a Rust version of the gRPC client.

Triton Client

Proto files can be compiled into different programming languages. In Rust, prost can be used to generate simple Rust code from proto files, which can then be combined with tonic to write production-ready gRPC code.

Retrieving the Triton Server Protos

After creating a new Rust project, we can retrieve the proto files defined by NVIDIA using git submodules:

git submodule add git@github.com:triton-inference-server/common.git

The protos are defined in /common/protobuf/.

The advantage of using a submodule is that when NVIDIA updates the code, we spend less time updating our dependencies.
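
For example, pulling in upstream changes later is a single command, run from the repository root (a sketch; --remote moves the submodule to the latest commit of its tracked branch):

git submodule update --remote common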

Generating Rust Code

To generate Rust code from these protos, we need to add these dependencies:

cargo add prost
cargo add tonic
cargo add --build tonic-build
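
After these commands, Cargo.toml should contain something like the following (a sketch; the version numbers are assumptions, and the prost version must be the one the chosen tonic release is built against):

[dependencies]
prost = "0.11"
tonic = "0.9"

[build-dependencies]
tonic-build = "0.9"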

We then need to write some code in the build.rs file at the crate root:

use std::env;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read the proto directory from the TRITON_PROTOBUF environment variable,
    // falling back to the git submodule path inside the crate.
    let pb_dir: std::path::PathBuf = env::var("TRITON_PROTOBUF")
        .ok()
        .unwrap_or_else(|| concat!(env!("CARGO_MANIFEST_DIR"), "/common/protobuf").to_string())
        .into();

    // Build the full path of each proto file we want to compile.
    let protobuf_paths: Vec<_> = ["grpc_service.proto", "health.proto", "model_config.proto"]
        .map(|protoname| pb_dir.join(protoname))
        .to_vec();

    // Generate the Rust code into Cargo's OUT_DIR.
    tonic_build::configure()
        .build_server(true)
        .compile(&protobuf_paths, &[&pb_dir])?;

    Ok(())
}

This part reads the source directory of the Triton protos from the environment, falling back to the submodule path, and turns it into a PathBuf:

let pb_dir: std::path::PathBuf = env::var("TRITON_PROTOBUF")
    .ok()
    .unwrap_or_else(|| concat!(env!("CARGO_MANIFEST_DIR"), "/common/protobuf").to_string())
    .into();

We then build the complete path of each proto we want to generate Rust code for:

let protobuf_paths: Vec<_> = ["grpc_service.proto", "health.proto", "model_config.proto"]
    .map(|protoname| pb_dir.join(protoname))
    .to_vec();

And finally we call tonic_build to generate the Rust code in Cargo's default output directory (OUT_DIR):

tonic_build::configure()
    .build_server(true)
    .compile(&protobuf_paths, &[&pb_dir])?;

Finally, we run:

cargo build

Using the generated code

Now that we have generated the Rust code from our proto files, we can include it in our lib.rs as a module:

pub mod triton {
    include!(concat!(env!("OUT_DIR"), "/inference.rs"));
}

And start sending gRPC messages to the Triton Inference Server from an async runtime such as tokio:

let url = env::var("TRITON_HOST").ok().unwrap_or_else(|| "http://localhost:8001".to_string());
let mut client = GrpcInferenceServiceClient::connect(url).await.unwrap();
let response = client.server_live(ServerLiveRequest {}).await.unwrap();
println!("{:?}", response.into_inner()); // OK => the server is live :D
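
For completeness, here is a minimal runnable sketch of the same call, assuming tokio is added with the macros and rt-multi-thread features and that the library crate is named triton_client (both are assumptions, not part of the generated code):

use std::env;

use triton_client::triton::grpc_inference_service_client::GrpcInferenceServiceClient;
use triton_client::triton::ServerLiveRequest;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Default to the local Triton gRPC port when TRITON_HOST is not set.
    let url = env::var("TRITON_HOST").unwrap_or_else(|_| "http://localhost:8001".to_string());

    let mut client = GrpcInferenceServiceClient::connect(url).await?;

    // ServerLive only checks that the server is up and answering.
    let response = client.server_live(ServerLiveRequest {}).await?;
    println!("server live: {}", response.into_inner().live);

    Ok(())
}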

Improvements

We can improve the current code by implementing wrappers around GrpcInferenceServiceClient and builders for the different messages.
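
As a sketch of that idea, here is what a thin wrapper might look like; the TritonClient type and its method names are hypothetical, not part of the generated code:

use tonic::transport::Channel;

use crate::triton::grpc_inference_service_client::GrpcInferenceServiceClient;
use crate::triton::{ModelInferRequest, ModelInferResponse, ServerLiveRequest};

/// Hypothetical convenience wrapper hiding the raw gRPC types.
pub struct TritonClient {
    inner: GrpcInferenceServiceClient<Channel>,
}

impl TritonClient {
    pub async fn connect(url: String) -> Result<Self, tonic::transport::Error> {
        Ok(Self {
            inner: GrpcInferenceServiceClient::connect(url).await?,
        })
    }

    /// Returns true when the server answers the ServerLive probe.
    pub async fn is_live(&mut self) -> Result<bool, tonic::Status> {
        let response = self.inner.server_live(ServerLiveRequest {}).await?;
        Ok(response.into_inner().live)
    }

    /// Thin passthrough for inference; a builder could assemble the request.
    pub async fn infer(
        &mut self,
        request: ModelInferRequest,
    ) -> Result<ModelInferResponse, tonic::Status> {
        let response = self.inner.model_infer(request).await?;
        Ok(response.into_inner())
    }
}

A builder for ModelInferRequest could follow the same pattern, accumulating the model name, inputs, and requested outputs before producing the request.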