Textures are a central topic in rendering. Although they have many uses, one of their primary purposes is to provide a greater level of detail to surfaces than can be achieved with vertex colors alone.
In this post, we’ll talk about texture mapping, which helps us bring virtual characters to life. We’ll also introduce samplers, which give us powerful control over how texture data is interpreted while drawing. Along the way, we will be assisted by a cartoon cow named Spot.
You can download the sample code for this post here.
Textures are formatted image data. This makes them distinct from buffers, which are unstructured blocks of memory. The types of textures we will be working with in this post are 2D images. Although Metal supports several other kinds of textures, we can introduce almost all of the important concepts by walking through an example of texture mapping.
Texture mapping is the process of associating each vertex in a mesh with a point in a texture. This is similar to wrapping a present, in that a 2D sheet of wrapping paper (a texture) is made to conform to a 3D present (the mesh).
Texture mapping is usually done with a specialized editing tool. The mesh is unwrapped into a planar graph (a 2D figure that maintains as much of the connectivity of the 3D model as possible), which is then overlaid on the texture image.
The following figure shows a 3D model of a cow (with the edges of its constituent faces highlighted) and its corresponding texture map. Notice that the various parts of the cow (head, torso, legs, horns) have been separated on the map even though they are connected on the 3D model.
In Metal, the origin of the pixel coordinate system of a texture coincides with its top left corner. This is the same as the coordinate system in UIKit. However, it differs from the default texture coordinate system in OpenGL, where the origin is the bottom left corner.
The axes of the texture coordinate system are often labeled s and t (or u and v) to distinguish them from the x and y axes of the world coordinate system.
The coordinate system of the texture coordinates must agree with the coordinate system of the image data. Since it is most natural to work in an image editor where the origin is in the top left, the saved image will often be upside-down from the perspective of the Metal texture coordinate system. This can be solved either when the image is written, by storing the image “upside-down”, or when the image is read, by flipping the image data vertically before it is copied into the texture.
The sample code chooses to flip images when loading. The image utilities from UIKit are used to load the image, which is then transformed into the Metal texture coordinate system by drawing it into a “flipped” context. The code for this is shown in a later section.
Pixel Coordinates Versus Normalized Coordinates
Metal is flexible enough to allow us to specify texture coordinates in pixel coordinates or normalized coordinates. Pixel coordinates range from (0, 0) to (width – 1, height – 1). They therefore depend on the dimensions of the texture image. Normalized coordinates range from (0, 0) to (1, 1), which makes them independent of image size.
I have chosen to use normalized coordinates throughout this post.
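To make the relationship between the two coordinate spaces concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The function names are hypothetical, and the sketch assumes the common convention that texel centers sit at half-integer pixel positions.

```c
/* Map a normalized coordinate to the index of the texel that contains it.
 * Out-of-range inputs are clamped to the first or last texel. */
static int texel_index_from_normalized(float coord, int size) {
    int index = (int)(coord * (float)size);
    if (index < 0) index = 0;
    if (index > size - 1) index = size - 1;
    return index;
}

/* Map a texel index back to the normalized coordinate of the texel's center
 * (assumes texel centers lie at half-integer pixel positions). */
static float normalized_from_texel_index(int index, int size) {
    return ((float)index + 0.5f) / (float)size;
}
```

Because the normalized form divides by the texture size, the same coordinate keeps pointing at the same relative location if the image is later swapped for a higher-resolution version.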
Textures are discrete images composed of a finite number of pixels (called texels). However, when drawing, a texture may be drawn at a resolution that is higher or lower than its native size. Therefore, it is important to be able to determine what the color of a texture should be between its texels, or when many texels are crunched into the same space. When a texture is being drawn at a size higher than its native size, this is called magnification. The inverse process, where a texture is drawn below its native resolution, is called minification.
The process of reconstructing image data from a texture is called filtering. Metal offers two different filtering modes: nearest and linear.
Nearest (also called “nearest-neighbor”) filtering simply selects the closest texel to the requested texture coordinate. This has the advantage of being very fast, but it can cause the rendered image to appear blocky when textures are magnified (i.e., when each texel covers multiple pixels).
Linear filtering selects the four nearest texels and produces a weighted average according to the distance from the sampled coordinate to the texels. Linear filtering produces much more visually-pleasing results than nearest-neighbor filtering and is sufficiently fast to be performed in real-time.
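The two filtering modes can be sketched on the CPU as follows. This is only an illustration in plain C, not how the GPU implements filtering: the Texture struct and function names are hypothetical, the texture has a single float channel, out-of-range texels are clamped to the edge, and texel centers are assumed to sit at half-integer positions.

```c
/* Hypothetical single-channel texture with row-major float texels. */
typedef struct {
    int width;
    int height;
    const float *texels;
} Texture;

static int clamp_int(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Floor for values that fit in an int, without needing the math library. */
static int floor_to_int(float v) {
    int i = (int)v;
    return (v < (float)i) ? i - 1 : i;
}

/* Read one texel, clamping out-of-range indices to the edge. */
static float fetch(const Texture *t, int x, int y) {
    x = clamp_int(x, 0, t->width - 1);
    y = clamp_int(y, 0, t->height - 1);
    return t->texels[y * t->width + x];
}

/* Nearest filtering: return the single texel containing (u, v). */
static float sample_nearest(const Texture *t, float u, float v) {
    return fetch(t, floor_to_int(u * t->width), floor_to_int(v * t->height));
}

/* Linear filtering: blend the four texels surrounding (u, v). */
static float sample_linear(const Texture *t, float u, float v) {
    /* Shift by half a texel so integer positions land on texel centers. */
    float x = u * t->width - 0.5f;
    float y = v * t->height - 0.5f;
    int x0 = floor_to_int(x);
    int y0 = floor_to_int(y);
    float fx = x - (float)x0;   /* horizontal blend weight in [0, 1) */
    float fy = y - (float)y0;   /* vertical blend weight in [0, 1) */
    float top    = fetch(t, x0, y0)     * (1.0f - fx) + fetch(t, x0 + 1, y0)     * fx;
    float bottom = fetch(t, x0, y0 + 1) * (1.0f - fx) + fetch(t, x0 + 1, y0 + 1) * fx;
    return top * (1.0f - fy) + bottom * fy;
}
```

On a 2×2 texture whose left column is 0 and right column is 1, sampling at the center with nearest filtering snaps to one column, while linear filtering returns the halfway value.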
When a texture is minified, multiple texels may coincide with a single pixel. Even slight motion in the scene can cause a shimmering phenomenon to appear. Linear filtering does not help the situation, as the set of texels covering each pixel changes from frame to frame.
One way to abate this issue is to prefilter the texture image into a sequence of images, called mipmaps. Each mipmap in a sequence is half the size of the previous image, down to a size of 1×1. When a texture is being minified, the mipmap at the closest resolution is substituted for the original texture image.
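As a rough illustration of the prefiltering step, the following plain C sketch produces the next-smaller image in a mipmap sequence by averaging each 2×2 block of texels (a box filter). The function names are hypothetical, the image is assumed to have a single 8-bit channel stored row by row, and a box filter is only one of several possible prefilters.

```c
#include <stdint.h>

/* Each level is half the size of the previous one, but never smaller than 1. */
static int half_size(int size) {
    return size > 1 ? size / 2 : 1;
}

/* Fill dst (half_size(srcW) x half_size(srcH) texels) with 2x2 averages of src. */
static void downsample_box(const uint8_t *src, int srcW, int srcH, uint8_t *dst) {
    int dstW = half_size(srcW);
    int dstH = half_size(srcH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            int x0 = 2 * x;
            int y0 = 2 * y;
            /* Reuse the last row or column when the source is only 1 texel wide or tall. */
            int x1 = (x0 + 1 < srcW) ? x0 + 1 : x0;
            int y1 = (y0 + 1 < srcH) ? y0 + 1 : y0;
            unsigned sum = (unsigned)src[y0 * srcW + x0] + src[y0 * srcW + x1]
                         + src[y1 * srcW + x0] + src[y1 * srcW + x1];
            dst[y * dstW + x] = (uint8_t)((sum + 2) / 4);   /* rounded average */
        }
    }
}
```

Calling this repeatedly, feeding each result back in as the next source, yields the whole sequence down to 1×1.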
The name “mipmap” comes from the Latin phrase multum in parvo, which roughly means “much in little”.
Although it is possible to set up your own mipmap sequence manually, Metal can work out how many levels are needed when you create a texture. The texture2DDescriptorWithPixelFormat:width:height:mipmapped: convenience method (which we will use later) will calculate the number of mipmap levels necessary and assign it to the texture descriptor it returns. Otherwise, you are obligated to set the mipmapLevelCount property to the appropriate value, which happens to be floor(log2(max(width, height))) + 1.
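The level count for a full mipmap chain follows directly from repeatedly halving the larger dimension until it reaches 1. A small plain C sketch (the function name is hypothetical) avoids floating-point logarithms altogether:

```c
/* Number of levels in a full mipmap chain, counting the base image,
 * for a texture whose width and height are both at least 1. */
static int mip_level_count(int width, int height) {
    int largest = width > height ? width : height;
    int levels = 1;               /* the base level */
    while (largest > 1) {
        largest /= 2;             /* each level halves the larger dimension */
        ++levels;
    }
    return levels;
}
```

For a 256×256 texture this gives 9 levels: 256, 128, 64, 32, 16, 8, 4, 2, and 1.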
The sample code does not demonstrate the use of mipmapping.
The following discussion is framed in terms of normalized coordinates. If you prefer pixel coordinates, simply replace “1” with “width” or “height” where appropriate.
Typically, when associating texture coordinates with the vertices of a mesh, the values are constrained to [0, 1] along both axes. However, this is not always the case. Negative texture coordinates, or coordinates greater than 1, can also be used. When coordinates outside the [0, 1] range are used, the addressing mode of the sampler comes into effect. There are a variety of different behaviors that can be selected.
In clamp-to-edge addressing, the value along the edge of the texture is repeated for out-of-bounds values.
In clamp-to-zero addressing, the sampled value for out-of-bounds coordinates is either black or clear, depending on whether the texture has an alpha color component.
In repeat addressing, out-of-bounds coordinates wrap around the corresponding edge of the texture and repeat from zero. In other words, the sampled coordinates are the fractional parts of the input coordinates, ignoring the integer parts.
Mirrored Repeat Addressing
In mirrored repeat addressing, the sampled coordinates first increase from 0 to 1, then decrease back to 0, and so on. This causes the texture to be flipped and repeated across every other integer boundary.
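The coordinate-remapping addressing modes can be summarized in a few lines of plain C. This is a sketch with hypothetical names, applied to one normalized coordinate at a time. Clamp-to-zero is left out because it replaces the sampled color rather than remapping the coordinate.

```c
typedef enum {
    ADDRESS_CLAMP_TO_EDGE,
    ADDRESS_REPEAT,
    ADDRESS_MIRRORED_REPEAT
} AddressMode;

/* Fractional part in [0, 1), also for negative inputs (e.g. -0.25 -> 0.75). */
static float fractional_part(float v) {
    float f = v - (float)(int)v;
    return f < 0.0f ? f + 1.0f : f;
}

/* Remap an arbitrary normalized coordinate into the [0, 1] range. */
static float apply_address_mode(float u, AddressMode mode) {
    switch (mode) {
    case ADDRESS_CLAMP_TO_EDGE:
        return u < 0.0f ? 0.0f : (u > 1.0f ? 1.0f : u);
    case ADDRESS_REPEAT:
        return fractional_part(u);
    case ADDRESS_MIRRORED_REPEAT: {
        /* Position within one forward-then-backward period of length 2. */
        float t = fractional_part(u * 0.5f) * 2.0f;
        return t <= 1.0f ? t : 2.0f - t;
    }
    }
    return u;
}
```

For example, an input of 1.25 becomes 1 under clamp-to-edge, 0.25 under repeat, and 0.75 under mirrored repeat.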
Creating Textures in Metal
Before we actually ask Metal to create a texture object, we need to know a little more about pixel formats, as well as how to get texture data into memory.
A pixel format describes the way color information is laid out in memory. There are a few different aspects to this information: the color components, the color component ordering, the color component size, and the presence or absence of compression.
The common color components are red, green, blue, and alpha (transparency). These may all be present (as in an RGBA format), or one or more may be absent. In the case of a fully opaque image, alpha information may be omitted.
Color component ordering refers to the order in which the color components appear in memory, for example BGRA or RGBA.
Colors may be represented with any degree of precision, but two popular choices are 8 bits per component and 32 bits per component. Most commonly, when 8 bits are used, each component is an unsigned 8-bit integer, a value between 0 and 255. When 32 bits are used, each component is usually a 32-bit float ranging from 0.0 to 1.0. Obviously, 32 bits offer far greater precision than 8 bits, but 8 bits is usually sufficient for capturing the perceivable differences between colors, and is much better from a memory standpoint.
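The trade-off between the two representations can be sketched in plain C. The function names are hypothetical, and the conversion assumes the usual normalized mapping in which the integer 255 corresponds to 1.0.

```c
#include <stddef.h>
#include <stdint.h>

/* Interpret an unsigned 8-bit component as a value in [0.0, 1.0]. */
static float unorm8_to_float(uint8_t value) {
    return (float)value / 255.0f;
}

/* Quantize a float component to 8 bits, clamping to [0.0, 1.0] and rounding. */
static uint8_t float_to_unorm8(float value) {
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;
    return (uint8_t)(value * 255.0f + 0.5f);
}

/* Bytes needed for one uncompressed image with no row padding. */
static size_t texture_byte_size(size_t width, size_t height,
                                size_t componentsPerPixel, size_t bytesPerComponent) {
    return width * height * componentsPerPixel * bytesPerComponent;
}
```

A 1024×1024 RGBA image takes 4 MiB at 8 bits per component and 16 MiB at 32 bits per component, which is the memory cost the paragraph above alludes to.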
The pixel formats supported by Metal are listed in the MTLPixelFormat enumeration.
Loading Image Data
We will use the powerful utilities provided by UIKit to load images from the application bundle. To create a UIImage instance from an image in the bundle, we only need one line of code:
UIImage *image = [UIImage imageNamed:textureName];
Unfortunately, UIKit does not provide a way to access the underlying bits of a UIImage. Instead, we have to draw the image into a Core Graphics bitmap context that has the same format as our desired texture. As part of this process, we transform the context (with a translation followed by a scale) so that the resulting image bits are flipped vertically. This causes the coordinate space of our image to agree with Metal’s texture coordinate space.
CGImageRef imageRef = [image CGImage];

// Create a suitable bitmap context for extracting the bits of the image
NSUInteger width = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
uint8_t *rawData = (uint8_t *)calloc(height * width * 4, sizeof(uint8_t));
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;
CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                             bitsPerComponent, bytesPerRow, colorSpace,
                                             kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease(colorSpace);

// Flip the context so the positive Y axis points down
CGContextTranslateCTM(context, 0, height);
CGContextScaleCTM(context, 1, -1);

CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
CGContextRelease(context);
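The sample code flips the image by transforming the drawing context. An alternative, mentioned earlier, is to flip the image data itself before it is copied into the texture. The following plain C sketch shows that approach. It is not what the sample project does, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reverse the order of the rows of a tightly packed image in place.
 * Returns 0 on success, or -1 if the scratch row could not be allocated. */
static int flip_rows_vertically(uint8_t *pixels, size_t height, size_t bytesPerRow) {
    if (height < 2 || bytesPerRow == 0) {
        return 0;   /* nothing to swap */
    }
    uint8_t *scratch = (uint8_t *)malloc(bytesPerRow);
    if (scratch == NULL) {
        return -1;
    }
    /* Swap the first row with the last, the second with the second-to-last, ... */
    for (size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        uint8_t *topRow = pixels + top * bytesPerRow;
        uint8_t *bottomRow = pixels + bottom * bytesPerRow;
        memcpy(scratch, topRow, bytesPerRow);
        memcpy(topRow, bottomRow, bytesPerRow);
        memcpy(bottomRow, scratch, bytesPerRow);
    }
    free(scratch);
    return 0;
}
```

With a buffer laid out like rawData above, the call would be flip_rows_vertically(rawData, height, bytesPerRow), made in place of the two context transforms.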
A texture descriptor is a lightweight object that specifies the dimensions and format of a texture. When creating a texture, you provide a texture descriptor and receive an object that conforms to the MTLTexture protocol, which is a subprotocol of MTLResource. The properties specified on the texture descriptor (texture type, dimensions, and format) are immutable once the texture has been created, but you can still update the content of the texture as long as the pixel format of the new data matches the pixel format of the receiving texture.
The MTLTextureDescriptor class provides a couple of factory methods for building common texture types. To describe a 2D texture, you must specify the pixel format, texture dimensions in pixels, and whether Metal should allocate space to store the appropriate mipmap levels for the texture.
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                                                                             width:width
                                                                                            height:height
                                                                                         mipmapped:YES];
Creating a Texture Object
It is now quite straightforward to create a texture object. We simply request a texture from the device by supplying a valid texture descriptor:
id<MTLTexture> texture = [self.device newTextureWithDescriptor:textureDescriptor];
Updating Texture Contents
Setting the data in the texture is also quite simple. We create an MTLRegion that represents the entire texture and then tell the texture to replace that region with the raw image bits we previously retrieved from the context:
MTLRegion region = MTLRegionMake2D(0, 0, width, height);
[texture replaceRegion:region mipmapLevel:0 withBytes:rawData bytesPerRow:bytesPerRow];
free(rawData); // The texture now holds its own copy of the image data
Passing Textures to Shader Functions
This texture is now ready to be used in a shader. To pass a texture to a shader function, we set it on our command encoder right before we issue our draw call:
[commandEncoder setFragmentTexture:texture atIndex:0];
This texture can now be referred to by index with the attribute [[texture(0)]] in a shader function’s parameter list.
In Metal, a sampler is an object that encapsulates the various render states associated with reading textures: coordinate system, addressing mode, and filtering. It is possible to create samplers in shader functions or in application code. We will discuss each in turn in the following sections.
Creating Samplers in Shaders
We will use samplers in our fragment function, because we want to produce a different color for each pixel in the rendered image. So, it sometimes makes sense to create samplers directly inside a fragment function.
The following code creates a sampler that will sample in the normalized coordinate space, using the repeat addressing mode, with linear filtering:
constexpr sampler s(coord::normalized, address::repeat, filter::linear);
Samplers that are local to a shading function must be qualified with constexpr. This keyword, new in C++11, signifies that an expression may be computed at compile-time rather than runtime. This means that just one sampler struct will be created for use across all invocations of the function.
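The coord value may either be normalized or pixel.
The possible values for address are clamp_to_zero, clamp_to_edge, repeat, and mirrored_repeat.
The possible values for filter are nearest and linear, to select between nearest-neighbor and linear filtering.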
All of these values belong to strongly-typed enumerations. Strongly-typed enumerations are a new feature in C++11 that allow stricter type-checking of enumerated values. It is an error to omit the type name of the value (i.e., you must say filter::linear and not simply linear).
The parameters may be specified in any order, since the constructor of the sampler is implemented as a variadic template function (another new feature of C++11).
Using a Sampler
Getting a color from a sampler is straightforward. Textures have a sample function that takes a sampler and a set of texture coordinates, returning a color. In a shader function, we call this function, passing in a sampler and the (interpolated) texture coordinates of the current fragment. In the line below, the interpolated vertex data is assumed to be named vert, since vertex is a reserved word in the Metal shading language.
float4 sampledColor = texture.sample(sampler, vert.textureCoords);
The sampled color may then be used in whatever further computation you want, such as lighting.
Creating Samplers in Application Code
To create a sampler in application code, we fill out an MTLSamplerDescriptor object and then ask the device to give us a sampler (of type id<MTLSamplerState>):
MTLSamplerDescriptor *samplerDescriptor = [MTLSamplerDescriptor new];
samplerDescriptor.minFilter = MTLSamplerMinMagFilterNearest;
samplerDescriptor.magFilter = MTLSamplerMinMagFilterLinear;
samplerDescriptor.sAddressMode = MTLSamplerAddressModeRepeat;
samplerDescriptor.tAddressMode = MTLSamplerAddressModeRepeat;
id<MTLSamplerState> sampler = [device newSamplerStateWithDescriptor:samplerDescriptor];
Note that we had to individually specify the magnification and minification filters. When creating samplers in shader code, we used the filter parameter to specify both at once, but we could also use mag_filter and min_filter separately to get the same behavior as above.
Similarly, the address mode must be set separately for each texture axis, whereas we did this with the sole address parameter in shader code.
Passing Samplers as Shader Arguments
Passing samplers looks very similar to passing textures, but samplers reside in a different set of argument table slots, so we use a different method to bind them:
[commandEncoder setFragmentSamplerState:sampler atIndex:0];
We can now refer to this sampler by attributing it with [[sampler(0)]] in shader code.
The Sample Project
The sample code for this post borrows heavily from the earlier post on lighting and rendering in 3D, so the shared concepts are not explained in this post.
The major change is the use of a texture/sampler pair to determine the diffuse color of the mesh at each pixel instead of interpolating a single color across the whole surface. This allows us to draw our textured cow model, producing a greater sense of realism and detail in the scene.
You can download the sample project here. If you run it on a Metal-enabled device, you can spin Spot around in 3D to show off those specular highlights. Be careful, though: if you spin her too fast, you might hear her moo in protest…