Translators
Earlier, in Stability, we learned how DTrace computes and reports program stability attributes. Ideally, we would construct our DTrace programs using only Stable or Evolving interfaces. Unfortunately, when debugging a low-level problem or measuring system performance, you may need to enable probes associated with internal operating system routines, such as functions in the kernel, rather than probes associated with more stable interfaces such as system calls. The data available at probe locations deep within the software stack is often a collection of implementation artifacts rather than more stable data structures such as those associated with the illumos system call interfaces. To help you write stable D programs, DTrace provides a facility to translate implementation artifacts into stable data structures accessible from your D program statements.
40.1. Translator Declarations
A translator is a collection of D assignment statements provided by the supplier of
an interface that can be used to translate an input expression into an object of struct
type. To understand the need for and use of translators, we'll consider as an example
the ANSI-C standard library routines defined in stdio.h. These routines operate on a data structure named FILE
whose implementation artifacts are abstracted away from C programmers. A standard
technique for creating a data structure abstraction is to provide only a forward declaration
of a data structure in public header files, while keeping the corresponding struct
definition in a separate private header file.
If you are writing a C program and wish to know the file descriptor corresponding to a FILE struct, you can use the fileno(3C) function to obtain the descriptor rather than dereferencing a member of the FILE struct directly. The illumos header files enforce this rule by defining FILE as an opaque forward declaration tag so it cannot be dereferenced directly by C programs that include <stdio.h>. Inside the libc.so.1 library, you can imagine that fileno is implemented in C something like this:
int
fileno(FILE *fp)
{
    struct file_impl *ip = (struct file_impl *)fp;
    return (ip->fd);
}
Our hypothetical fileno takes a FILE pointer as an argument and casts it to a pointer to a corresponding internal libc structure, struct file_impl, and then returns the value of the fd member of the implementation structure. Why does illumos implement interfaces like this? By abstracting the details of the current libc implementation away from client programs, Sun is able to maintain a commitment to strong binary compatibility while continuing to evolve and change the internal implementation details of libc. In our example, the fd member could change size or position within struct file_impl, even in a patch, and existing binaries calling fileno(3C) would not be affected by this change because they do not depend on these artifacts.
Unfortunately, observability software such as DTrace needs to peer inside the implementation in order to provide useful results, and does not have the luxury of calling arbitrary C functions defined in illumos libraries or in the kernel. You could declare a copy of struct file_impl in your D program in order to instrument the routines declared in stdio.h, but then your D program would rely on Private implementation artifacts of the library that might break in a future micro or minor release, or even in a patch. Ideally, we want to provide a construct for use in D programs that is bound to the implementation of the library and is updated accordingly, but still provides an additional layer of abstraction associated with greater stability.
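To make the fragility concrete, the following sketch shows roughly what copying the private struct into a D program might look like. The layout of struct file_impl, its member names, and the choice of fclose as the instrumented function are assumptions made for illustration; they are not the actual libc definitions.

```d
/*
 * HYPOTHETICAL private layout, copied by hand from the library's
 * internal headers. If the real struct changes, this program keeps
 * compiling but reports the wrong data.
 */
struct file_impl {
    int fd;     /* assumed member: file descriptor */
    int eof;    /* assumed member: end-of-file flag */
};

pid$target:libc.so.1:fclose:entry
{
    /* arg0 is the user-space FILE pointer; copy its bytes into DTrace. */
    this->ip = (struct file_impl *)copyin(arg0,
        sizeof (struct file_impl));
    printf("fclose on fd %d\n", this->ip->fd);
}
```

Every D program written this way carries its own copy of the Private layout, so each one must be found and corrected by hand whenever the library changes.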
A new translator is created using a declaration of the form:
translator output-type < input-type input-identifier > {
    member-name = expression ;
    member-name = expression ;
    ...
};
The output-type names a struct that will be the result type for the translation. The input-type specifies the type of the input expression; it is enclosed in angle brackets < > and followed by an input-identifier that can be used in the translator expressions as an alias for the input expression. The body of the translator is enclosed in braces { } and terminated with a semicolon (;), and consists of a list of member-name identifiers and their corresponding translation expressions. Each member declaration must name a unique member of the output-type and must be assigned an expression of a type compatible with the member type, according to the rules for the D assignment (=) operator.
For example, we could define a struct of stable information about stdio files based on some of the available libc interfaces:
struct file_info {
    int file_fd;  /* file descriptor from fileno(3C) */
    int file_eof; /* eof flag from feof(3C) */
};
A hypothetical D translator from FILE to file_info could then be declared in D as follows:
translator struct file_info < FILE *F > {
    file_fd = ((struct file_impl *)F)->fd;
    file_eof = ((struct file_impl *)F)->eof;
};
In our hypothetical translator, the input expression is of type FILE * and is assigned the input-identifier F. The identifier F can then be used in the translator member expressions as a variable of type FILE * that is only visible within the body of the translator declaration. To determine the value of the output file_fd member, the translator performs a cast and dereference similar to the hypothetical implementation of fileno(3C) shown above. A similar translation is performed to obtain the value of the EOF indicator.
Sun provides a set of translators for illumos interfaces that you can invoke from your D programs, and promises to keep these translators up to date, according to the rules for interface stability defined earlier, as the implementation of the corresponding interfaces changes. We'll learn about these translators later in the chapter, after we learn how to invoke translators from D. The translator facility itself is also available to application and library developers who wish to offer their own translators that D programmers can use to observe the state of their software packages.
40.2. Translate Operator
The D operator xlate is used to perform a translation from an input expression to one of the defined translation output structures. The xlate operator is used in an expression of the form:
xlate < output-type > ( input-expression )
For example, to invoke the hypothetical translator for FILE structs defined above and access the file_fd member, you would write the expression:
xlate <struct file_info *>(f)->file_fd;
where f is a D variable of type FILE *. The xlate expression itself is assigned the type defined by the output-type. Once a translator is defined, it can be used to translate input expressions to either the translator output struct type, or to a pointer to that struct.
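Putting the pieces together, a complete script might look like the sketch below. It is an illustration under stated assumptions rather than a working libc translator: struct file_impl and its members are hypothetical, fclose is an arbitrary choice of probe, and the translator takes struct file_impl * as its input type (instead of FILE *) so that the script is self-contained and does not depend on a FILE type being visible to the D compiler.

```d
struct file_impl {      /* HYPOTHETICAL private layout */
    int fd;
    int eof;
};

struct file_info {      /* stable view offered to D programs */
    int file_fd;
    int file_eof;
};

/* Translator that would be maintained alongside the private struct. */
translator struct file_info < struct file_impl *F > {
    file_fd = F->fd;
    file_eof = F->eof;
};

pid$target:libc.so.1:fclose:entry
{
    /* Copy the user-space object in before translating it. */
    this->ip = (struct file_impl *)copyin(arg0,
        sizeof (struct file_impl));

    /* Translate to a pointer and select one member with "->". */
    printf("fclose on fd %d\n",
        xlate <struct file_info *>(this->ip)->file_fd);
}
```

The probe clause now names only struct file_info and its members. If the private layout changed, only the translator body would need to be updated.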
If you translate an input expression to a struct, you can either dereference a particular member of the output immediately using the “.” operator, or you can assign the entire translated struct to another D variable to make a copy of the values of all the members. If you dereference a single member, the D compiler will only generate code corresponding to the expression for that member. You may not apply the & operator to a translated struct to obtain its address, as the data object itself does not exist until it is copied or one of its members is referenced.
If you translate an input expression to a pointer to a struct, you can either dereference
a particular member of the output immediately using the ->
operator, or you can dereference the pointer using the unary *
operator, in which case the result behaves as if you translated the expression to
a struct. If you dereference a single member, the D compiler will only generate code
corresponding to the expression for that member. You may not assign a translated pointer
to another D variable as the data object itself does not exist until it is copied
or one of its members is referenced, and therefore cannot be addressed.
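The member-access forms described above can be compared side by side. This sketch assumes the procfs.d translator from the kernel's process structure to psinfo_t covered in the Process Model Translators section, and uses curthread->t_procp as the input expression; the tick-1sec probe is an arbitrary choice made only to get a probe firing.

```d
#pragma D option quiet

profile:::tick-1sec
{
    /* Pointer to the kernel's Private process structure. */
    this->p = curthread->t_procp;

    /* Struct form: select one member with ".". */
    this->a = xlate <psinfo_t>(this->p).pr_pid;

    /* Pointer form: select one member with "->". */
    this->b = xlate <psinfo_t *>(this->p)->pr_pid;

    /* Pointer form dereferenced with unary "*"; acts like the struct form. */
    this->c = (*xlate <psinfo_t *>(this->p)).pr_pid;

    printf("%d %d %d\n", this->a, this->b, this->c);
    exit(0);

    /*
     * Per the rules above, the compiler is expected to reject both of:
     *   &xlate <psinfo_t>(this->p)
     *   this->q = xlate <psinfo_t *>(this->p);
     */
}
```

All three expressions name the same member of the same translation, so in each case the compiler only needs to generate code for the pr_pid expression.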
A translator declaration may omit expressions for one or more members of the output
type. If an xlate
expression is used to access a member for which no translation expression is defined,
the D compiler will produce an appropriate error message and abort the program compilation.
If the entire output type is copied by means of a structure assignment, any members
for which no translation expressions are defined will be filled with zeroes.
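As an illustration of an incomplete translator, the sketch below leaves file_eof without a translation expression. As before, struct file_impl, its members, and the fclose probe are hypothetical choices made for the example.

```d
struct file_impl { int fd; int eof; };              /* HYPOTHETICAL */
struct file_info { int file_fd; int file_eof; };

/* file_eof is deliberately left without a translation expression. */
translator struct file_info < struct file_impl *F > {
    file_fd = F->fd;
};

pid$target:libc.so.1:fclose:entry
{
    this->ip = (struct file_impl *)copyin(arg0,
        sizeof (struct file_impl));

    /* Allowed: file_fd has a translation expression. */
    printf("fd %d\n", xlate <struct file_info *>(this->ip)->file_fd);

    /*
     * Uncommenting the next line should make compilation fail, because
     * no translation expression is defined for file_eof:
     *
     * printf("eof %d\n", xlate <struct file_info *>(this->ip)->file_eof);
     */
}
```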
In order to find a matching translator for an xlate
operation, the D compiler examines the set of available translators in the following
order:
- First, the compiler looks for a translation from the exact input expression type to the exact output type.
- Second, the compiler resolves the input and output types by following any typedef aliases to the underlying type names, and then looks for a translation from the resolved input type to the resolved output type.
- Third, the compiler looks for a translation from a compatible input type to the resolved output type. The compiler uses the same rules as it does for determining compatibility of function call arguments with function prototypes in order to determine if an input expression type is compatible with a translator's input type.
If no matching translator can be found according to these rules, the D compiler produces an appropriate error message and program compilation fails.
40.3. Process Model Translators
The DTrace library file /usr/lib/dtrace/procfs.d provides a set of translators for use in your D programs to translate from the operating system kernel implementation structures for processes and threads to the stable proc(4) structures psinfo and lwpsinfo. These structures are also used in the illumos /proc filesystem files /proc/pid/psinfo and /proc/pid/lwp/lwpid/lwpsinfo, and are defined in the system header file /usr/include/sys/procfs.h. These structures define useful Stable information about processes and threads such as the process ID, LWP ID, initial arguments, and other data displayed by the ps(1) command. Refer to proc(4) for a complete description of the struct members and semantics.
Input Type | Input Type Attributes | Output Type | Output Type Attributes |
---|---|---|---|
proc_t * | Private/Private/Common | psinfo_t * | Stable/Stable/Common |
kthread_t * | Private/Private/Common | lwpsinfo_t * | Stable/Stable/Common |
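The procfs.d translators can be invoked explicitly with xlate. The sketch below assumes the kernel expressions curthread (a thread pointer) and curthread->t_procp (the owning process) as inputs; the choice of the exec-success probe and of the members printed is arbitrary. Because the input expressions refer to Private kernel data, the resulting program is not stable even though the output types are.

```d
#pragma D option quiet

proc:::exec-success
{
    /* Process-level data: translate the process structure to psinfo_t. */
    /* Thread-level data: translate the thread structure to lwpsinfo_t. */
    printf("pid %d lwp %d: %s\n",
        xlate <psinfo_t *>(curthread->t_procp)->pr_pid,
        xlate <lwpsinfo_t *>(curthread)->pr_lwpid,
        xlate <psinfo_t *>(curthread->t_procp)->pr_psargs);
}
```

The member names pr_pid, pr_lwpid, and pr_psargs are those documented in proc(4).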
40.4. Stable Translations
While a translator provides the ability to convert information into a stable data structure, it does not necessarily resolve all stability issues that can arise in translating data. For example, if the input expression for an xlate operation itself references Unstable data, the resulting D program is also Unstable because program stability is always computed as the minimum stability of the accumulated D program statements and expressions. Therefore, it is sometimes necessary to define a specific stable input expression for a translator in order to permit stable programs to be constructed. The D inline mechanism can be used to facilitate such stable translations.
The DTrace procfs.d library provides the curlwpsinfo and curpsinfo variables described earlier as stable translations. For example, the curlwpsinfo variable is actually an inline declared as follows:
inline lwpsinfo_t *curlwpsinfo = xlate <lwpsinfo_t *> (curthread);
#pragma D attributes Stable/Stable/Common curlwpsinfo
The curlwpsinfo variable is defined as an inlined translation from the curthread variable, a pointer to the kernel's Private data structure representing a thread, to the Stable lwpsinfo_t type. The D compiler processes this library file and caches the inline declaration, making curlwpsinfo appear as any other D variable. The #pragma statement following the declaration is used to explicitly reset the attributes of the curlwpsinfo identifier to Stable/Stable/Common, masking the reference to curthread in the inlined expression. This combination of D features permits D programmers to use curthread as the source of a translation in a safe fashion, because the inline and its translator can be updated by Sun in step with corresponding changes in the illumos implementation.
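In practice, this means a D program can read process and thread data without naming any kernel structure. A minimal sketch, with the exec-success probe and the printed members chosen only for illustration:

```d
#pragma D option quiet

proc:::exec-success
{
    /*
     * curpsinfo and curlwpsinfo are the inlines provided by procfs.d;
     * no reference to curthread or other kernel types appears here.
     */
    printf("%-6d %-4d %s\n",
        curpsinfo->pr_pid,
        curlwpsinfo->pr_lwpid,
        curpsinfo->pr_psargs);
}
```

Because the clause refers only to identifiers whose attributes are declared Stable/Stable/Common, it does not inherit the Private attributes of curthread.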