Rust for Linux Basics: Handling Function Pointers

2023-10-09 (Mon) ; by ~lymkwi (Lux Amelia) ; 1387 words ; about 7 minutes ;

Part of the Rust for Linux Basics series: Function pointers are often essential in kernel APIs. Using them from Rust while still interfacing with the C ABI, however, creates problems. Here is an explanation of the currently popular solution.

Note: I originally wrote this as a message in the RFL Zulip chat, answering someone who asked about ways to handle function pointers in Rust for Linux, especially in the context of net, where we do it quite often. This blog post is a slightly refined version of that message, where I got carried away and essentially infodumped on the poor fella. I am a silly fox 🦊

Fun fact! A while back I originally planned to write a blog post about the very process of me figuring this whole thing out again from scratch, which I accidentally published for a couple days while unfinished. Oops.

Screenshot of my unpublished post: Rust FFI Callbacks for the Kernel at Compile Time

Here's a little explainer on how I handled function pointers. It's not far from what I have seen others do, and there might be common patterns or good practices that I sidestepped or skipped. I'm just learning. Thankfully, I have had to explain this very thing many times to a handful of people, so it should be somewhat structured.

I'll take the example of my NAPI structure, as a way of implementing one function pointer needed in a structure. From my experience of moving other function pointers around (like struct net_device's priv_destructor, in groups/descriptors such as RTNL, and such), it doesn't really change too much.

It's going to be a lot of things you might already know or find obvious because I prefer being thorough and verbose rather than accidentally missing something critical, and it might be useful for people coming at this with very little familiarity with the project.

The general idea is going to be:

Create a trait definition marked #[vtable] that contains the Rust function(s) you want to implement, with all types in the return value and arguments replaced with their Rust counterparts
Implement that trait on a structure that will define your function for your use case
Have a second type that acts like a factory, and is generic on any type that implements your trait
The factory type is going to contain the static definition of the final pointer/series of pointers you will use. It decides which ones are defined thanks to booleans created at the implementation step of a #[vtable] trait
The functions being pointed to will wrap around your actual implementation: they receive arguments with binding types and convert them, hiding away all of the unsafe {} from the driver user, before calling your implementation with the abstracted arguments. They do the reverse operation for any return value your Rust implementation gives
When you need the pointer in Rust code, you can call a method that returns a reference to the static stored for the specific version of your factory type for the trait of the implementer, and have an abstraction wrapper set it

Step 0: Prerequisites

You should identify the types of your function's arguments and return value on the C side, and prepare abstractions for them if they do not exist yet. In the case of NAPI polling, you get a struct napi_struct* and an int, and return an int. Because I was already writing an abstraction for struct napi_struct, I was all good, and could convert back and forth between struct napi_struct* and &Napi/&mut Napi. As far as I know, that returned int is not meant to signal errors; it only tells the stack how many packets were received in that poll run.
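
To make "convert back and forth" a bit more concrete, here is a minimal user-space sketch of one way such a conversion can be written, assuming the abstraction is a transparent wrapper around the binding type. RawNapi stands in for bindings::napi_struct, and every method name below is made up for the example; the real abstraction may well be built differently.

```rust
/// Stand-in for `bindings::napi_struct` (hypothetical, heavily simplified).
#[repr(C)]
pub struct RawNapi {
    pub weight: i32,
}

/// Abstraction over the raw structure. `repr(transparent)` guarantees that
/// `Napi` and `RawNapi` share the same layout, so pointers can be cast.
#[repr(transparent)]
pub struct Napi(RawNapi);

impl Napi {
    pub fn new(weight: i32) -> Self {
        Napi(RawNapi { weight })
    }

    /// Turn a pointer received from C into a Rust reference.
    ///
    /// # Safety
    /// `ptr` must be valid, properly aligned, and not aliased for `'a`.
    pub unsafe fn from_raw<'a>(ptr: *mut RawNapi) -> &'a mut Napi {
        // SAFETY: guaranteed by the caller, plus the transparent layout.
        unsafe { &mut *ptr.cast::<Napi>() }
    }

    /// Turn the Rust reference back into a pointer for the C side.
    pub fn as_raw(&mut self) -> *mut RawNapi {
        (self as *mut Napi).cast()
    }

    pub fn weight(&self) -> i32 {
        self.0.weight
    }

    pub fn set_weight(&mut self, weight: i32) {
        self.0.weight = weight;
    }
}

fn main() {
    let mut napi = Napi::new(64);
    let raw = napi.as_raw();
    // SAFETY: `raw` comes from a live `Napi` that is not used during the borrow.
    let back = unsafe { Napi::from_raw(raw) };
    back.set_weight(32);
    println!("weight is now {}", napi.weight());
}
```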

Step 1: Making a Trait

The #[vtable] macro was created for these kinds of scenarios where you need to have a sort of "table" of function definitions (on the Rust side, on the C side it's pointers), and know which ones have been defined by an implementation.

Say I have:

/// NAPI poller definition trait
#[vtable]
pub trait NapiPoller {
    /// Implementation of the poll function
    fn poll(_napi: &mut Napi, _budget: i32) -> i32 {
        0 // Not implemented
    }
}

(actual version here)
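
If you have never met #[vtable] before, here is a rough user-space sketch of what I understand the macro to contribute, written out by hand: one HAS_<METHOD> boolean constant per method, false by default in the trait and set to true in an implementation that provides the method. The Napi type and both poller types are stand-ins invented for the example, and the real macro output may differ in detail.

```rust
/// Stand-in for the real NAPI abstraction (hypothetical, for illustration).
pub struct Napi {
    pub received: i32,
}

/// Hand-written approximation of a `#[vtable]` trait.
pub trait NapiPoller {
    /// The macro would generate this: false unless an impl provides `poll`.
    const HAS_POLL: bool = false;

    /// Default body, used when the implementer does not provide `poll`.
    fn poll(_napi: &mut Napi, _budget: i32) -> i32 {
        0
    }
}

/// Provides `poll`, so the macro would emit `HAS_POLL = true` for this impl.
pub struct RealPoller;

impl NapiPoller for RealPoller {
    const HAS_POLL: bool = true;

    fn poll(napi: &mut Napi, budget: i32) -> i32 {
        // Pretend we received as many packets as the budget allows.
        napi.received += budget;
        budget
    }
}

/// Provides nothing, so the defaults (including `HAS_POLL = false`) apply.
pub struct EmptyPoller;

impl NapiPoller for EmptyPoller {}

fn main() {
    let mut napi = Napi { received: 0 };
    println!("RealPoller:  HAS_POLL = {}", RealPoller::HAS_POLL);
    println!("EmptyPoller: HAS_POLL = {}", EmptyPoller::HAS_POLL);
    println!("poll returned {}", RealPoller::poll(&mut napi, 16));
}
```

The point of the boolean is that a default method body alone cannot tell you whether the implementer overrode it, and that is exactly the information needed to choose between a real pointer and a null one later on.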

Step 2: Make a builder

The builder/factory is going to hold your pointer definition(s) and give you static methods to retrieve it. In my case I even defined the exact type of the poller function as a type because it was just a bit too annoying to type out in its entirety.

/// Building structure for poller functions
pub struct PollerBuilder<T: NapiPoller> {
    _p: PhantomData<T>,
}

type PollerFunction = unsafe extern "C" fn(*mut bindings::napi_struct, i32) -> i32;

You've probably seen it already, but function pointers on the Rust side are Option<unsafe extern "C" fn(A....) -> B>, and we keep them at None if nothing is provided.
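
Here is a small user-space check of why Option is the right wrapper. Rust guarantees that an Option around a function pointer has the same size as the bare pointer, with None taking the place of null, which is what lets it stand in for a nullable C function pointer. The PollFn alias and the double callback are made up for the example.

```rust
use std::mem::{size_of, transmute};

/// Hypothetical callback type, shaped like a C function pointer.
type PollFn = unsafe extern "C" fn(i32) -> i32;

unsafe extern "C" fn double(budget: i32) -> i32 {
    budget * 2
}

/// Look at the optional pointer as a plain address, the way C would see it.
fn raw_address(f: Option<PollFn>) -> usize {
    // SAFETY: `Option<PollFn>` is pointer-sized (the compiler rejects the
    // transmute otherwise) and every bit pattern is a valid `usize`.
    unsafe { transmute::<Option<PollFn>, usize>(f) }
}

fn main() {
    // No extra discriminant: the "is it there?" information lives in the
    // pointer value itself.
    assert_eq!(size_of::<Option<PollFn>>(), size_of::<PollFn>());

    let missing: Option<PollFn> = None;
    let present: Option<PollFn> = Some(double as PollFn);

    println!("None as an address: {:#x}", raw_address(missing));
    println!("Some is non-null: {}", raw_address(present) != 0);

    if let Some(func) = present {
        // SAFETY: `double` has no preconditions.
        println!("double(21) = {}", unsafe { func(21) });
    }
}
```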

The builder implementation is valid for any T: NapiPoller, and it holds my static function definition:

impl<T: NapiPoller> PollerBuilder<T> {
    const FUNC: Option<PollerFunction> = <T>::HAS_POLL.then_some(Self::poller_callback);

    /// Access the function pointer
    pub const fn build_function() -> Option<PollerFunction> {
        Self::FUNC
    }

    unsafe extern "C" fn poller_callback(napi: *mut bindings::napi_struct, budget: i32) -> i32 {
        // SAFETY: <cut for brevity>
        let napi: &mut Napi = unsafe { &mut *napi.cast() };

        // The rest is primitive, hence, trivial
        <T>::poll(napi, budget)
    }
}

When I wrote my own version, I forgot to check <T>::HAS_POLL. bool::then_some had also only recently been stabilized, so I did not use it.
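
If bool::then_some turns out not to be callable inside a constant on your toolchain, a plain if/else expresses the same selection and is accepted in const contexts. The sketch below runs in user space: HAS_POLL is written by hand in place of what #[vtable] would generate, and RawNapi, CountingPoller and SilentPoller are invented for the example.

```rust
use std::marker::PhantomData;

/// Stand-in for `bindings::napi_struct` (hypothetical layout).
#[repr(C)]
pub struct RawNapi {
    pub polled: i32,
}

/// Transparent wrapper, so a `*mut RawNapi` can be cast to `*mut Napi`.
#[repr(transparent)]
pub struct Napi(RawNapi);

pub trait NapiPoller {
    /// Written by hand here; `#[vtable]` would generate it.
    const HAS_POLL: bool = false;

    fn poll(_napi: &mut Napi, _budget: i32) -> i32 {
        0
    }
}

type PollerFunction = unsafe extern "C" fn(*mut RawNapi, i32) -> i32;

pub struct PollerBuilder<T: NapiPoller> {
    _p: PhantomData<T>,
}

impl<T: NapiPoller> PollerBuilder<T> {
    /// `if`/`else` is allowed in constants, so no const method call is needed.
    const FUNC: Option<PollerFunction> = if T::HAS_POLL {
        Some(Self::poller_callback as PollerFunction)
    } else {
        None
    };

    pub const fn build_function() -> Option<PollerFunction> {
        Self::FUNC
    }

    unsafe extern "C" fn poller_callback(napi: *mut RawNapi, budget: i32) -> i32 {
        // SAFETY: the caller passes a valid, exclusive pointer, and `Napi`
        // is a transparent wrapper around `RawNapi`.
        let napi: &mut Napi = unsafe { &mut *napi.cast() };
        T::poll(napi, budget)
    }
}

pub struct CountingPoller;

impl NapiPoller for CountingPoller {
    const HAS_POLL: bool = true;

    fn poll(napi: &mut Napi, budget: i32) -> i32 {
        napi.0.polled += budget;
        budget
    }
}

/// Does not provide `poll`, so its builder should hold `None`.
pub struct SilentPoller;

impl NapiPoller for SilentPoller {}

fn main() {
    let mut raw = RawNapi { polled: 0 };
    let func = PollerBuilder::<CountingPoller>::build_function().expect("poll is provided");
    // SAFETY: `raw` is valid and not aliased for the duration of the call.
    let done = unsafe { func(&mut raw, 8) };
    println!("polled {done}, total {}", raw.polled);

    let silent = PollerBuilder::<SilentPoller>::build_function();
    println!("SilentPoller has a callback: {}", silent.is_some());
}
```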

Step 3: Implement in user and abstractions

Let's say I have my driver, and it needs to have a definition of NAPI polling. You put whatever you want in it. As a fun thought experiment, if you do not provide an implementation of poll, you can imagine that the factory structure will contain None in its const field, and that you will end up assigning a null function pointer. That fits the semantics we want when a field in a descriptor is left undefined, or a function pointer is set to null, for lack of an implementation.

Oh and your implementation also has to be marked #[vtable], so the macro can process it and add the booleans indicating whether you overrode each function or not. I really wish we had some way to do that in the core language, but I think we're in such a specific use case that it might never come to core.

Now, for my poller, I needed an additional function that sets the poll field of struct napi_struct through my abstraction, so I wrote the wrapper for it (in my net::Device, which I guess made sense at the time), and this neat little trick I found helps me pass the trait implementer type for my implementation of poll to the function:

pub fn napi_add<POLLER: gro::NapiPoller>(&mut self, napi: &mut gro::Napi) {
    // ... conversion stuff
    // Use our factory type on the type provided to retrieve its implementation as a const pointer
    // that way we can pass it around on the C side and it will remain valid, and hide all of the
    // wrapping from the Rust user
    let poll_func = gro::PollerBuilder::<POLLER>::build_function();

    // SAFETY: ....
    unsafe { bindings::netif_napi_add(dev_ptr, napi_ptr, poll_func) };
}

And in my driver:

dev.napi_add::<MyNapiPoller>(&mut napi);
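
The type-parameter trick can be tried outside the kernel too. In this sketch the caller names the implementer with a turbofish and never touches a function pointer, while the compiler instantiates one extern "C" trampoline per implementer type. FakeNapi, fake_napi_add, trampoline and run_poll are invented stand-ins for the C structure, the C registration function, the wrapper callback and the network stack; none of them are real kernel APIs.

```rust
type PollFn = unsafe extern "C" fn(*mut FakeNapi, i32) -> i32;

/// Stand-in for the C structure that stores the callback (hypothetical).
#[repr(C)]
pub struct FakeNapi {
    pub poll: Option<PollFn>,
    pub weight: i32,
}

pub trait NapiPoller {
    fn poll(weight: i32, budget: i32) -> i32;
}

/// One copy of this function exists per `P`, each usable as a C callback.
unsafe extern "C" fn trampoline<P: NapiPoller>(napi: *mut FakeNapi, budget: i32) -> i32 {
    // SAFETY: the caller guarantees `napi` points to a live `FakeNapi`.
    let weight = unsafe { (*napi).weight };
    P::poll(weight, budget)
}

/// Stand-in for the C-side registration: it only ever sees a raw pointer.
fn fake_napi_add(napi: &mut FakeNapi, poll: Option<PollFn>) {
    napi.poll = poll;
}

/// The abstraction: the implementer is selected purely by the type parameter.
pub fn napi_add<P: NapiPoller>(napi: &mut FakeNapi) {
    fake_napi_add(napi, Some(trampoline::<P> as PollFn));
}

/// Plays the role of the network stack invoking the stored callback.
pub fn run_poll(napi: &mut FakeNapi, budget: i32) -> Option<i32> {
    let func = napi.poll?;
    // SAFETY: `napi` is a valid, exclusive reference for the whole call.
    Some(unsafe { func(napi, budget) })
}

pub struct MyNapiPoller;

impl NapiPoller for MyNapiPoller {
    fn poll(weight: i32, budget: i32) -> i32 {
        // Never report more work than the configured weight.
        budget.min(weight)
    }
}

fn main() {
    let mut napi = FakeNapi { poll: None, weight: 64 };
    napi_add::<MyNapiPoller>(&mut napi);
    println!("{:?}", run_poll(&mut napi, 100));
}
```

Because the implementer is a type rather than a value, nothing has to be stored or kept alive on the Rust side: the pointer handed to C refers to a function that exists for the whole lifetime of the program.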

Limitations

This is only an example but it outlines the general steps I undertook for RTNL, NAPI, and various other function pointers. I got them from looking at code others made, and things like #[vtable] make me think this general structure is intended to be used, at least for now.

In terms of scaling? I haven't evaluated it, but I assume that changing function pointers on the fly is going to be a bit of a bother. Hopefully you do not actually need twenty or fifty different implementations, and you only have to implement one trait in the user. On the abstraction side, however, it gets problematic. If you are talking about descriptors/structures of many pointers, you need to do housekeeping to keep the names up to date (I believe this has been discussed in the recent proposal in netdev). Furthermore, it's still a lot of boilerplate code for one function pointer, and if you do not have any backing structure, you may need to spend a lot of time making structures that exist only to handle that function pointer.

Ah and don't forget: my code hasn't been reviewed by people on here. Handle it with an amount of salt typically reserved for a profoundly bland dish.

That's all I can think of for now, I might have gone a bit overboard with documenting this (I like writing documentation). None of this is proofread either, it's all just pure stream of thoughts, I hope it made sense at all. Hopefully it was useful in some way, and I will try and hang around to answer questions/fix issues people could raise if I said something wrong.