This is the third article in the series covering my learnings and experiments with the Interprocess Communication mechanisms in XNU. This time we’ll continue exploring APIs related to complex messages and see how to use them to transfer out-of-line data.

Previous parts of the series are:

Unless you’re already familiar with these concepts, I’d recommend reading the previous articles first.

More type descriptors

Part #2 of the series covered bidirectional communication. One of the ways to achieve this was exchanging port rights over Mach messages, using port descriptors. Mach messages that carry more than just inline data are referred to as complex, and sharing port rights isn’t their only use case. Another common descriptor type is the one used to transfer out-of-line (OOL) data, the OOL descriptor.

The power of OOL data, and its advantage over inline data, comes from integration with the virtual memory system. A sender can share entire memory pages with the receiver without manually copying the data into temporary buffers. It is especially beneficial when transferring a large amount of data. The kernel can directly operate on virtual memory mappings to make the transfer as fast as possible and minimize memory usage. The sender can even choose to deallocate the memory regions from its address space during sending, allowing for even more optimizations.

Given an understanding of the general Mach message concepts and message descriptors, transferring OOL data is relatively straightforward in terms of the Mach APIs. More complexity comes with more control over how to transfer the memory using different data exchange options.

This article focuses on the OOL descriptor format and exchanging OOL data between two processes. The virtual memory system integration details and different memory transfer options will be for another time.

OOL descriptor

When sharing data inline, all that’s required is adding extra fields to the message structure. As the name suggests, data transferred using the out-of-line descriptors comes from outside of the message structure. Let’s take a look at the descriptor structure:

typedef struct{
  void*                         address;
#if !defined(__LP64__)
  mach_msg_size_t               size;
#endif
  boolean_t                     deallocate: 8;
  mach_msg_copy_options_t       copy: 8;
  unsigned int                  pad1: 8;
  mach_msg_descriptor_type_t    type: 8;
#if defined(__LP64__)
  mach_msg_size_t               size;
#endif
} mach_msg_ool_descriptor_t;

The fields of the OOL descriptor are:

  • address of the out-of-line data
  • size of the data
  • deallocate, when true the memory region at the address will be removed from the sender’s address space once the message has been sent
  • copy defines the way of copying the memory
  • type of the message descriptor, for the OOL descriptor, it’s MACH_MSG_OOL_DESCRIPTOR

There are two copy types:

MACH_MSG_VIRTUAL_COPY -
  When sending a message, the kernel can choose exactly how
  to transmit the data. For example, it can decide to actually copy physical
  memory, or make a virtual copy.

  When receiving this means that the kernel made a virtual copy.

MACH_MSG_PHYSICAL_COPY -
  When sending a message, this option instructs the kernel
  to perform an actual copy of the physical memory.

  On the receiving end, this means there's a new physical copy of the memory.

For the scope of this article, you’re only going to need the MACH_MSG_VIRTUAL_COPY copy type.

Note that sending out-of-line data requires merely the data address and size, where both can be dynamic. This is unlike inline data, where indirection isn’t possible, and all data must be copied into the message buffer.

OOL messages example

Now that the theory behind out-of-line message descriptors has been covered, it’s time to write some more code. We will extend the client/server programs from the previous part, starting from the first example of bidirectional communication using the reply port. This time, the server program will be the one sending data, and it will do so upon the client’s request.

Let’s start with the client implementation.

Client

To retrieve the OOL data, we’ll need a new message structure type, one that includes the OOL descriptor. The client also has to explicitly request the data from the server, so it needs a new message id to distinguish it from the default messages.

#1 OOL message format

Similar to the port descriptor example, we need a complex Mach message. The following is the new message structure:

typedef struct {
  mach_msg_header_t header;
  mach_msg_size_t msgh_descriptor_count;
  mach_msg_ool_descriptor_t descriptor;
} OOLMessage;

OOLMessage consists of the message header, the number of type descriptors, a single OOL descriptor, and it has no inline data.

This time it’s the client that’s on the receiving side, so we also need a structure that includes the message trailer:

typedef struct {
  OOLMessage message;

  // Suitable for use with the default trailer type - no custom trailer
  // information requested using `MACH_RCV_TRAILER_TYPE`, or just the explicit
  // `MACH_RCV_TRAILER_NULL` type.
  mach_msg_trailer_t trailer;
} OOLReceiveMessage;

Let’s also define the new message id. The client can use it to request OOL data, and the server will use it in the OOL response.

#define MSG_ID_COPY_MEM 10

#2 Message receive routine

The client program already has a basic message retrieval routine. Now we need to add one for the OOL message. It’s not really different from the routines for other message types. We could use a trick similar to the union type from the previous article, but this time let’s keep it simple and add a dedicated routine:

mach_msg_return_t
receive_ool_message(mach_port_name_t recvPort, OOLReceiveMessage *rcvMessage) {
  mach_msg_return_t ret = mach_msg(
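      // `rcvMessage` is already a pointer to the receive buffer, so it is
      // cast directly instead of taking its address.
      /* msg */ (mach_msg_header_t *)rcvMessage,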
      /* option */ MACH_RCV_MSG,
      /* send size */ 0,
      /* recv size */ sizeof(*rcvMessage),
      /* recv_name */ recvPort,
      /* timeout */ MACH_MSG_TIMEOUT_NONE,
      /* notify port */ MACH_PORT_NULL);
  if (ret != MACH_MSG_SUCCESS) {
    return ret;
  }

  if (rcvMessage->message.header.msgh_id != MSG_ID_COPY_MEM) {
    return RCV_ERROR_INVALID_MESSAGE_ID;
  }

  return MACH_MSG_SUCCESS;
}

#3 Sending request

To request the OOL data from the server, we can use the basic Message structure, with the MSG_ID_COPY_MEM id. You must not forget to specify the msgh_local_port field and message bits. Otherwise, the server won’t be able to answer.

Message message = {0};
message.header.msgh_size = sizeof(message);

// Setup message ports.
message.header.msgh_remote_port = port;
message.header.msgh_local_port = replyPort;

// Setup message rights. 
message.header.msgh_bits = MACH_MSGH_BITS_SET(
    /* remote */ MACH_MSG_TYPE_COPY_SEND,
    /* local */ MACH_MSG_TYPE_MAKE_SEND,
    /* voucher */ 0,
    /* other */ 0);

// Setup message data.
strcpy(message.bodyStr, "Request OOL");
message.header.msgh_id = MSG_ID_COPY_MEM;

// Send OOL request message.
ret = mach_msg(
    /* msg */ (mach_msg_header_t *)&message,
    /* option */ MACH_SEND_MSG,
    /* send size */ sizeof(message),
    /* recv size */ 0,
    /* recv_name */ MACH_PORT_NULL,
    /* timeout */ MACH_MSG_TIMEOUT_NONE,
    /* notify port */ MACH_PORT_NULL);

#4 Receiving OOL data

After sending the request, the client expects to receive a response with OOL data. Now it’s time to use the OOL message retrieval routine. Upon a successful read, you can print the data contents (assumed here to be a string), the full data size, and how the data was transferred, based on the copy field:

OOLReceiveMessage rcvMessage = {0};
ret = receive_ool_message(replyPort, &rcvMessage);
if (ret != MACH_MSG_SUCCESS) {
  printf("Failed to receive an OOL message: %#x\n", ret);
  return 1;
}

printf(
    "%s\n"
    "  copy option: %d\n"
    "  buffer addr: %p\n"
    "  total buffer size: %#x\n",
    (const char *)rcvMessage.message.descriptor.address,
    rcvMessage.message.descriptor.copy,
    rcvMessage.message.descriptor.address,
    rcvMessage.message.descriptor.size);

Server

On the server side, we need to handle the out-of-line data request. But first, we need to have some data that we can share using the OOL descriptor. Only then can we send the message.

#1 OOL message buffer

To send the OOL data, we need some memory to share with the client. You could use any valid memory chunk or allocate one using standard APIs such as malloc. However, since the goal is to learn more about XNU, let’s use one of its native Mach APIs - vm_allocate. vm_allocate allocates virtual memory regions directly, without going through malloc or other higher-level APIs.

Here we allocate a new memory region of one page - vm_page_size bytes, at an address chosen by the kernel. Memory allocated with vm_allocate is zero-filled and, by default, readable and writable.

void *oolBuffer = NULL;
vm_size_t oolBufferSize = vm_page_size;
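// The last argument takes allocation flags (VM_FLAGS_*), not VM_PROT_* bits.
// VM_FLAGS_ANYWHERE lets the kernel choose the address.
if (vm_allocate(
        mach_task_self(),
        (vm_address_t *)&oolBuffer,
        oolBufferSize,
        VM_FLAGS_ANYWHERE) != KERN_SUCCESS) {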
  printf("Failed to allocate memory buffer\n");
  return 1;
}

Once the region is ready, we can fill in some dummy data to share with the clients:

strcpy((char *)oolBuffer, "Hello, OOL data!");

#2 Message send routine

Now it’s time to send the data. This is also very similar to the other types of messages, so let’s first focus on the OOL type descriptor fields:

typedef struct {
  mach_msg_header_t header;
  mach_msg_size_t msgh_descriptor_count;
  mach_msg_ool_descriptor_t descriptor;
} OOLMessage;

...

OOLMessage message = {0};
... 

// Specify memory region address and size.
message.descriptor.address = addr;
message.descriptor.size = size;

// Memory region sending options.
// Use `MACH_MSG_VIRTUAL_COPY` for the kernel to decide how to transfer memory
// `deallocate = false` - memory region must not be deallocated after sending
//   the message, as it will be sent over and over again to different clients.
message.descriptor.copy = MACH_MSG_VIRTUAL_COPY;
message.descriptor.deallocate = false;

// Specify the type descriptor type - OOL message.
message.descriptor.type = MACH_MSG_OOL_DESCRIPTOR;

And the entire sending routine is:

mach_msg_return_t send_ool_reply(
    mach_port_name_t port,
    const Message *inMessage,
    void *addr,
    mach_msg_size_t size) {
  OOLMessage message = {0};
  message.header.msgh_bits = MACH_MSGH_BITS_SET(
      /* remote */ MACH_MSG_TYPE_COPY_SEND,
      /* local */ 0,
      /* voucher */ 0,
      /* other */ MACH_MSGH_BITS_COMPLEX);
  message.header.msgh_remote_port = port;
  message.header.msgh_id = inMessage->header.msgh_id;
  message.header.msgh_size = sizeof(message);
  message.msgh_descriptor_count = 1;

  message.descriptor.address = addr;
  message.descriptor.size = size;
  message.descriptor.copy = MACH_MSG_VIRTUAL_COPY;
  message.descriptor.deallocate = false;
  message.descriptor.type = MACH_MSG_OOL_DESCRIPTOR;

  return mach_msg(
      /* msg */ (mach_msg_header_t *)&message,
      /* option */ MACH_SEND_MSG,
      /* send size */ sizeof(message),
      /* recv size */ 0,
      /* recv_name */ MACH_PORT_NULL,
      /* timeout */ MACH_MSG_TIMEOUT_NONE,
      /* notify port */ MACH_PORT_NULL);
}

#3 Server loop

The data to send is now in place, and so is the routine to send it. The only remaining piece is to handle the client’s request and send the OOL response. Until now, the server only sent back the default message, provided the incoming message had a reply port:

while (true) {
  ...

  // Continue if there's no reply port.
  if (receiveMessage.message.header.msgh_remote_port == MACH_PORT_NULL) {
    continue;
  }

  // Send a response.
  ret = send_reply(
      receiveMessage.message.header.msgh_remote_port,
      &receiveMessage.message);
}

The client program is going to use the dedicated MSG_ID_COPY_MEM message id to request the OOL data. So the server can use the message id field to distinguish the type of a request and send OOL data when needed:

switch (receiveMessage.message.header.msgh_id) {
case MSG_ID_COPY_MEM:
  // As per request, respond with OOL memory.
  ret = send_ool_reply(
      receiveMessage.message.header.msgh_remote_port,
      &receiveMessage.message,
      oolBuffer,
      oolBufferSize);
  break;
default:
  // Send a generic, inline response.
  ret = send_reply(
      receiveMessage.message.header.msgh_remote_port,
      &receiveMessage.message);
  break;
}

Result

All the pieces are now in place. When you start the server program and run the client, you should see the following messages in the client’s output:

got response message!
  id      : 8
  bodyS   : Response - Hello Mach!
  bodyI   : 510
Hello, OOL data!
  copy option: 1
  buffer addr: 0x10b8a2000
  total buffer size: 0x4000 

The client has sent two messages, and the server has responded to both, with inline and out-of-line data. Hello, OOL data! is the out-of-line data we wanted to transfer, so that worked out. Copy option 1 means MACH_MSG_VIRTUAL_COPY, and thus it looks like a virtual memory copy.

Summary

In this part, we’ve built upon the previously introduced concepts of complex messages and type descriptors to learn how to transfer arbitrary, out-of-line data using Mach messages. We also had a peek into the virtual memory APIs in XNU with the vm_allocate function.

Next time, we will dive into the integration of OOL data in Mach messages with the virtual memory system.

The full implementation of the example can be found on GitHub.


Thanks for reading! If you’ve got any questions, comments, or general feedback, you can find all my social links at the bottom of the page.