Deciphering Glyph :: Opaque Types in Python
Deciphering
Glyph
About Archives Mastodon GitHub Patrons
Opaque Types in Python A proposed technique for exposing an opaque data structure with idiomatic modern Python. pythonprogramming Thursday May 21, 2026
Let’s say you’re writing a Python library. In this library, you have some collection of state that represents “options” or “configuration” for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs. By way of example, imagine that you’re wrapping a library that handles shipping physical packages. There are a zillion ways to do it ship a package. There are different carriers who can ship it for you. There’s air freight, and ground freight, and sea freight. There’s overnight shipping. There’s the option to require a signature. There’s package tracking and certified mail. Suffice it to say, lots of stuff. If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this: 1 2 3 4 5async def shipPackage( how: ShippingOptions, where: Address, ) -> ShippingStatus: ...
If you are starting out implementing such a library, you know that you’re going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not “wrong”, then “incomplete”. You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well. Yet, ShippingOptions is absolutely vital to the rest of your library. You’ll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you’re not going to want a ton of complexity and churn as you evolve it to be more complex. Worse yet, this object has to hold a ton of state. It’s got attributes, maybe even quite complex internal attributes that relate to different shipping services. Right now, today, you need to add something so you can have “no rush”, “standard” and “expedited” options. You can’t just put off implementing that indefinitely until you can come up with the perfect shape. What to do? The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this. But in Python, if you expose a dataclass — or any class, really — even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won’t help your users; it’ll still look like it’s a normal class. Luckily, Python typing provides a tool for this: typing.NewType. Let’s review our requirements:
We need a type that our client code can use in its type annotations; it needs to be public. They need to be able to consruct it somehow, even if they shouldn’t be able to see its attributes or its internal constructor arguments. To express high-level things (like “ship fast”) that should stay supported as we add more nuanced and complex configurations in the future (like “ship with the fastest possible option provided by the lowest-cost carrier that supports signature verification”).
In order to solve these problems respectively, we will use:
a public NewType, which gives us our public name... which wraps a private class with entirely private attributes, to give us an actual data structure, while not exposing the constructor, a set of public constructor functions, which returns our NewType.
When we put that all together, it looks like this: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17from dataclasses import dataclass from typing import Literal, NewType
@dataclass class _RealShipOpts: _speed: Literal["fast", "normal", "slow"]
ShippingOptions = NewType("ShippingOptions", _RealShipOpts)
def shipFast() -> ShippingOptions: return ShippingOptions(_RealShipOpts("fast"))
def shipNormal() -> ShippingOptions: return ShippingOptions(_RealShipOpts("normal"))
def shipSlow() -> ShippingOptions: return ShippingOptions(_RealShipOpts("slow"))
As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code. However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let’s see how we might do that. For example, let’s allow the shipping options to contain a concrete and specific carrier and freight method: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35from dataclasses import dataclass from enum import Enum, auto from typing import NewType
class Carrier(Enum): FedEx = auto() USPS = auto() DHL = auto() UPS = auto()
class Conveyance(Enum): air = auto() truck = auto() train = auto()
@dataclass class _RealShipOpts: _carrier: Carrier _freight: Conveyance
ShippingOptions = NewType("ShippingOptions", _RealShipOpts)
def shipFast() -> ShippingOptions: return ShippingOptions(_RealShipOpts(Carrier.FedEx, Conveyance.air))
def shipNormal() -> ShippingOptions: return ShippingOptions(_RealShipOpts(Carrier.UPS, Conveyance.truck))
def shipSlow() -> ShippingOptions: return ShippingOptions(_RealShipOpts(Carrier.USPS, Conveyance.train))
def shippingDetailed( carrier: Carrier, conveyance: Conveyance ) -> ShippingOptions: return ShippingOptions(_RealShipOpts(carrier, conveyance))
As a NewType, our public ShippingOptions type doesn’t have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old versions. Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it’s the same type as its base at runtime, so it presents minimal1 overhead. Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior. If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!
Acknowledgments Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor.
The overhead is minimal, but it is not completely zero. The suggested idiom for converting to a NewType is to call it like a function, as I’ve done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use # type: ignore[return-value] comments to avoid that small cost. ↩
© Glyph 2025; All Rights Reserved Excepting Those Which Are Not. See my disclosure statements for information on my interests, financial and otherwise.
Mastodon |
The technique of opaque data type design in Python addresses the challenge of exposing complex state structures with a minimal public interface, which is particularly relevant when developing libraries where configuration options evolve over time. When implementing a library, such as one handling shipping options which can involve numerous, potentially ever-increasing attributes, the developer needs an object that encapsulates this state but restricts external interaction to specific, constrained ways of construction. This design aims to separate the necessary internal complexity from the public contract, ensuring API stability as the requirements for the configuration evolve.
The motivation stems from the difficulty of managing evolving complexity. If a class or dataclass is exposed directly, its constructor becomes part of the public interface, potentially forcing changes to that interface whenever internal attributes are added or modified. The author proposes using typing.NewType to create an opaque type that achieves this separation. This involves defining a private underlying class, such as _RealShipOpts, containing the actual, detailed attributes, and then defining the desired public type, ShippingOptions, as a NewType alias to this private class.
To provide controlled access, the implementation relies on creating specific public constructor functions, such as shipFast or shipNormal, which handle the creation of the opaque type by constructing the underlying private structure internally. This strategy ensures that users interact with the system through high-level, semantic methods rather than directly manipulating the internal structure or constructor arguments, thus maintaining control over object state and preventing compatibility churn.
The example demonstrates how this pattern preserves flexibility. Initially, the system might only expose basic speed options. Later, when the requirement expands to include specific carrier and freight methods, the opaque type structure allows the developer to introduce these new, complex attributes within the private definition without affecting the external type signature of ShippingOptions. Since NewType maintains runtime equality with its base type, the internal implementation details remain accessible to the library's internal code, while external code remains type-safe based on the public surface provided by the NewType. Consequently, clients outside the library can still utilize the original public functions while benefiting from the internal evolution of the data structure.
While the overhead of using this pattern is minimal, the approach effectively manages the tension between encapsulating complex state and maintaining a stable, idiomatic Python interface. The suggestion is that invoking the NewType should be handled like a function for most usage, although internal considerations might allow for ignoring type warnings during hot loops to mitigate minor performance costs. This design pattern offers a means to build robust and adaptable APIs by abstracting state management effectively. |