Checked-size array parameters in C
Recorded: Dec. 4, 2025, 3:06 a.m.
| Original | Summarized |
By Jonathan Corbet
December 1, 2025

The discussion started when Ard Biesheuvel sought to [...]

void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, [...]

A potential problem with this function is that it takes as parameters [...]

Biesheuvel suggested that it was possible to write the prototype this way [...]

void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, [...]

The types of the last two arguments have changed; there is a new level of [...]

Jason Donenfeld was [...]

void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, [...]

This, too, will cause the compiler to check the sizes of the arrays, and it [...]

Eric Biggers pointed out [...]

Torvalds, as it turns out, has [...]

The main issue with the whole 'static' thing is just that the [...]

He pointed out that there are a number of places in the kernel that are [...]

A lot of work has gone into the kernel to make its use of C as safe as [...]

[static n] should be avoided if possible
Posted Dec 2, 2025 0:54 UTC (Tue)

[n] should be as fine as [static n], and even better. It's unfortunate that the kernel had to turn -Wstringop-overflow off; I hope that can be resolved eventually, and code using [n] can be safe again. [static n] is not as good as [n], among other reasons, because it doesn't guarantee that the function will not read beyond the first 'n' elements. GCC diagnoses if a function reads beyond that, but that's a GNU extension. [static n] as designed by the C Committee is bad, and should be avoided.

[static n] should be avoided if possible
Posted Dec 2, 2025 9:10 UTC (Tue)

I'm not sure I understood that last bit. I thought that neither [static n] nor [n] can guarantee the function will not read beyond the first n elements (although either might warn). AFAICT the C standard defines static array indices as a single additional constraint on callers (e.g. there's nothing that [n] does that [static n] doesn't also do). Assuming that constraint is met, I don't see why [static n] would be not as good as [n] when the function implementation does not honour its prototype.

[static n] should be avoided if possible
Posted Dec 2, 2025 9:47 UTC (Tue)

It depends on what you call a guarantee. If you use appropriate compiler flags, you can get such guarantees [...]

alx@devuan:~/tmp$ cat array-bounds.c
[...] void g(a); [...]

The C standard claims [n] has no meaning at all, so it's easy for the C standard to also claim that [static n] has strictly stronger guarantees than [n]. However, the C standard doesn't acknowledge that existing quality compilers (i.e., GCC) assign [n] a stronger meaning than the C standard acknowledges.

About why [static n] would be worse than [n], there are a few reasons (plus the one mentioned above): [static n] implies [[gnu::nonnull()]] on the parameter, which is a UB bomb. [...]

[static n] should be avoided if possible
Posted Dec 2, 2025 10:09 UTC (Tue)

The a[100]=7 line only causes a warning/error at -O2, not at -O1 or -O0 (with GCC 13.3.0). So it can sometimes be a useful heuristic (when it's not causing too many false positives), but it's not what anyone would call a guarantee.

[static n] should be avoided if possible
Posted Dec 2, 2025 10:25 UTC (Tue)

I'm not sure how much heuristics are involved. I haven't seen the implementation. But yeah, I agree the diagnostics about array parameters need some improvements. On the other hand, -O2 is what most projects use. I personally build my projects in a loop with all different optimization levels, to maximize diagnostics.

[static n] should be avoided if possible
Posted Dec 2, 2025 15:28 UTC (Tue)

I think we're on the same page, although perhaps coming from opposite directions. From the point of view of the standard, [static n] is essentially a hint to the optimizer that allows it to optimize the function more aggressively by imposing restrictions on the caller. It has nothing to do with diagnostics, although without diagnostics there is a risk that failing to meet the restrictions would go undiscovered. Given how similar they are, I'd assume the only reason [static n] would ever get "better" diagnostics might be because the false positive rate is lower: if the warnings can be switched independently, then people who adopt the weird syntax for diagnostic reasons could simply switch back to [n] if there are false positive diagnostics.

[static n] is 100% useless (on top of being bad for safety)
Posted Dec 2, 2025 12:32 UTC (Tue)

My previous comment mentioned that [static n] is dangerous. However, it's worse than that. There's nothing that [static n] will enforce that wouldn't be enforced by [n]. Both GCC and Clang entirely ignore [static n] (as they should, it's pure crap) regarding diagnostics of array bounds. I can only assume they were assuming [static n] would do something it doesn't do but didn't really test, because it's trivial to show that [static n] doesn't add any diagnostics. See <https://lwn.net/ml/all/ei7wbiu6m2lvso3gbc4ohvz3h575anjxqm...>

Waste of developer mental capacity
Posted Dec 2, 2025 8:22 UTC (Tue)

Imagine a world where developers of a complex piece of software like an OS kernel would not have to worry about low-level compiler issues like this, which language creators and compiler developers have known how to solve for many decades.

Unfortunately compilers don't enforce [static N]
Posted Dec 2, 2025 13:24 UTC (Tue)

I wish that [static N] were used more widely. We could have had compile-time array-length and null-pointer tests in C for years. ([static 1] is a non-null pointer here.) However, citing C23 6.7.7.4, paragraph 6: "If the keyword static also appears within the [ and ] of the array type derivation, then for each call to [...]" This is essentially just an optimization hint. There is no requirement to do any bounds checking. When I tried a few years ago, clang did warn about violations while gcc didn't.

GCC enforces both [n] and [static N]
Posted Dec 2, 2025 13:32 UTC (Tue)

I don't know about the past, but in the present day, Clang is way behind GCC. GCC diagnoses both [n] and [static n] in exactly the same way, and [static n] is unnecessary.

alx@devuan:~/tmp$ cat array-bounds.c
void f(int a[42]); void gs(int a[static 43]); void fs(int a[static 42]); int f(a); [...]

GCC enforces both [n] and [static N]
Posted Dec 2, 2025 16:27 UTC (Tue)

That diagnosis for [n] is wrong though; the C standard is very clear that array parameters are identical to pointer parameters, with static within the square brackets being the sole case where there is any additional meaning conveyed by the use of an array. See C99 6.7.5.3. If anything, the correct behaviour would be to warn on the use of [n] in a parameter type as "this does not do what you think it does; if you don't mean [static n], [] or * would be a less misleading way to write it".

Quality of implementation
Posted Dec 2, 2025 21:32 UTC (Tue)

> That diagnosis for [n] is wrong though;

That's what we call quality-of-implementation. It's not a bug, it's a feature. Quality compilers often diagnose things that the standard doesn't restrict. That's expected. You wouldn't want to use a compiler that diagnoses exactly what the standard requires and no more.

> the C standard is very clear that array parameters are identical to pointer parameters,

That's a bug in the standard. I hope it will be fixed in ISO C2y.

Re: [static 1] is a non-null pointer
Posted Dec 2, 2025 13:38 UTC (Tue)

[static 1] is a bad version of __attribute__((nonnull())).
- It uses array notation for what is really a pointer. [...]

Re: [static 1] is a non-null pointer
Posted Dec 2, 2025 21:10 UTC (Tue)

Have you replied to the original email thread? It seems you have a point, but I don't think people from that thread are reading the comments under this article.

Re: [static 1] is a non-null pointer
Posted Dec 2, 2025 21:34 UTC (Tue)

Yup, we're talking about it.
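
For readers comparing the two spellings debated in the comments above, here is a minimal sketch that puts `[static 1]` next to the GNU `nonnull` attribute. It is an illustration only: the function names are invented here, and `__attribute__((nonnull))` is a GCC/Clang extension rather than standard C. Whether a given compiler diagnoses a violating call depends on its version and flags, so no warning text is claimed.

```c
#include <assert.h>
#include <stddef.h>

/* [static 1]: array notation used to say "points to at least one int". */
static int first_static(const int p[static 1])
{
	return p[0];
}

/* GNU extension (GCC and Clang): the same non-null intent expressed on an
 * ordinary pointer parameter. The attribute goes on a declaration. */
static int first_nonnull(const int *p) __attribute__((nonnull(1)));

static int first_nonnull(const int *p)
{
	return *p;
}

/* Hypothetical demo: both functions are called with a valid pointer. */
static int demo_nonnull(void)
{
	int value = 42;

	/* Passing NULL to either function would break the contract stated
	 * in its prototype; a compiler may or may not diagnose that. */
	assert(first_static(&value) == 42);
	return first_static(&value) + first_nonnull(&value);
}
```

Both call sites look identical; the difference lies in how the contract is written in the prototype, which is the point of disagreement in the thread.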

Weird corner of C
Posted Dec 2, 2025 16:42 UTC (Tue)

Array parameter types are a particularly cursed corner of C. By the standard, they are confusing syntax for a pointer. But that means that you can legally assign to the array itself:

int get_second(int a[]) { [...]

If you want to stop that, then, as with any other variable you want to protect from assignment, you need to mark it const. But int const a[] makes the pointed-to array elements const, not the pointer itself. The cursed solution that C came up with is putting the qualifiers inside the square brackets (where static can also go, as explained in the article), so you can write the following:

int get_second(int a[const]) { [...]

Other qualifiers work there too (e.g. restrict, which you do sometimes see in the real world).

Weird corner of C
Posted Dec 2, 2025 19:06 UTC (Tue)

Array-to-pointer decay is the monkey wrench in the C machine, notwithstanding that pointers are C's secret sauce. I wonder how things would have turned out if the committee had adopted Dennis Ritchie's array passing proposal, "Variable-Size Arrays in C", http://jclt.iecc.com/Jct22.pdf#page=5, instead of VLAs. Both proposals had to work around the unfortunate array parameter semantics in similar ways, though.

To editor: example using `at_least` needed
Posted Dec 3, 2025 3:22 UTC (Wed)

It would be nice if Jon would post an example that uses the new at_least macro, because people often use these LWN articles instructively, and I'd hate to have to clean up naked uses of static. Details are in the commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...

To editor: example using `at_least` needed
Posted Dec 3, 2025 17:04 UTC (Wed)

Sorry, at_least was a late addition. Perhaps the best example is this whole series of changes posted by Eric Biggers.

To editor: example using `at_least` needed
Posted Dec 3, 2025 17:46 UTC (Wed)

Or just patch 3/3 in the original series? It shows the error messages that a violation would create, etc. https://lore.kernel.org/all/20251123054819.2371989-4-Jaso...

bugged gcc
Posted Dec 4, 2025 2:14 UTC (Thu)

That commit by Linus to remove the array size warning indicates that on several versions of gcc, various target architectures get false positives. Very interesting! I wonder what could be causing that…
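
The function bodies in the "Weird corner of C" comment above were lost in extraction. The sketch below illustrates the same point with bodies invented for this purpose, so it should not be read as the commenter's code: an array parameter is really a pointer that can be reassigned, and `int a[const]` is another way of writing `int *const a`.

```c
#include <assert.h>

/* The parameter is really `int *a`, so assigning to it is legal. */
static int get_second(int a[])
{
	a = a + 1;	/* moves the local pointer, not the caller's array */
	return a[0];
}

/* `int a[const]` declares `int *const a`: the pointer itself is const,
 * while the elements it points to stay writable. */
static int get_second_const(int a[const])
{
	/* a = a + 1;	would be rejected: a is a const-qualified pointer */
	return a[1];
}

/* Hypothetical demo: both variants return the second element. */
static int demo_get_second(void)
{
	int values[3] = { 10, 20, 30 };

	assert(get_second(values) == 20);
	return get_second(values) == 20 && get_second_const(values) == 20;
}
```

Uncommenting the assignment in `get_second_const()` should make the compiler reject the function, which is the protection the qualifier-in-brackets syntax exists to provide.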
Copyright © 2025, Eklektix, Inc. |
Array parameters are a bizarre corner of C, a source of both frustration and occasional clever solutions. Although the language offers few built-in safety features, its unusual array-parameter syntax can be used, in the kernel’s crypto layer among other places, to catch undersized buffers and swapped arguments at compile time.

The story began when Ard Biesheuvel set out to improve the safety of `xchacha20poly1305_encrypt()`, which takes several pointers to `u8` data, including a nonce and a key. The compiler did not check the sizes of the nonce and key arrays, so it was easy to pass those arguments in the wrong order. Biesheuvel suggested changing the prototype to take pointers to arrays of a fixed size: `void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, const u8 *ad, const size_t ad_len, const u8 (*nonce)[XCHACHA20POLY1305_NONCE_SIZE], const u8 (*key)[CHACHA20POLY1305_KEY_SIZE]);` The compiler can then check the sizes, but every caller must add an `&` operator to obtain the required pointer type, which makes the approach somewhat clunky.

Jason Donenfeld proposed a more straightforward alternative. Buried deep within the C standard is a peculiar use of the `static` keyword inside the brackets of an array parameter, as in `const u8 nonce[static XCHACHA20POLY1305_NONCE_SIZE]`, which states that the argument must provide at least that many elements. This form also lets the compiler check array sizes, and callers do not need to add `&`.

Eric Biggers noted that GCC can often generate “array too small” warnings even without `static`, but that the kernel currently disables those warnings; they were turned off in 6.8 because of false positives. Even so, `static` was seen as worthwhile for the warnings it produces with Clang, and Linus Torvalds had no objection to using it, despite its odd syntax.
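
The three ways of declaring a fixed-size buffer parameter discussed above can be compared in a small sketch. This is not the kernel's code: the `u8` typedef, the `KEY_SIZE` value, and the `sum_*` and `check_all_styles()` names are placeholders invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;	/* stand-in for the kernel's u8 type */
#define KEY_SIZE 32	/* hypothetical size, for illustration only */

/* Style 1: plain sized array parameter. Per the C standard this is just
 * `const u8 *key`; the size carries no required meaning. */
static unsigned int sum_plain(const u8 key[KEY_SIZE])
{
	unsigned int total = 0;

	for (size_t i = 0; i < KEY_SIZE; i++)
		total += key[i];
	return total;
}

/* Style 2: pointer to an array of exactly KEY_SIZE bytes. The size is
 * part of the type, so callers must write &key. */
static unsigned int sum_ptr(const u8 (*key)[KEY_SIZE])
{
	unsigned int total = 0;

	for (size_t i = 0; i < KEY_SIZE; i++)
		total += (*key)[i];
	return total;
}

/* Style 3: [static N]. The caller promises at least KEY_SIZE elements;
 * call sites look the same as in style 1. */
static unsigned int sum_static(const u8 key[static KEY_SIZE])
{
	unsigned int total = 0;

	for (size_t i = 0; i < KEY_SIZE; i++)
		total += key[i];
	return total;
}

/* Calls all three with a correctly sized key and checks that they agree. */
static unsigned int check_all_styles(void)
{
	static const u8 key[KEY_SIZE] = { 1, 2, 3 };	/* rest is zero */
	const unsigned int expected = 6;

	assert(sum_plain(key) == expected);
	assert(sum_ptr(&key) == expected);	/* note the extra & */
	assert(sum_static(key) == expected);
	return expected;
}
```

Passing a shorter array to `sum_ptr()` is a type mismatch that the compiler must diagnose, while doing the same with `sum_static()` only breaks a promise that a compiler may or may not warn about. That trade-off between strictness and convenient call sites is what the discussion turned on.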

Torvalds liked the feature even while objecting to its syntax. He pointed out that several places in the kernel already use the technique; the virtual-terminal driver is one example. He suggested perhaps hiding the usage behind a macro, such as `min_array_size()`, to make it more readable, but did not seem convinced that this was necessary. Donenfeld followed up with a patch introducing such a macro, then pivoted to an `at_least` marker instead.

A lot of work has gone into making the kernel’s use of C as safe as possible, but that does not mean that all of the low-hanging fruit has been picked. The language has features, such as `static` in formal array parameters, that can improve safety but are not widely known and not often used. In this case, it would not be surprising to see this “horrible hack” come into wider use in the future.

The `at_least` marker was a late addition to the discussion. According to the comments on the article, it has been committed to the mainline tree, and a series of changes posted by Eric Biggers shows it in use. Commenters also stressed that array parameters are only one of several surprising behaviors in C that are hard to change because of compatibility concerns, and they disagreed about whether `[static n]` offers anything that a plain `[n]` does not.
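
To show how such a marker could read in practice, here is a minimal sketch. It assumes that `at_least` is nothing more than a macro that expands to `static`; the kernel's real definition is not reproduced in this document and may differ. The `fill_key()` and `demo_at_least()` functions and the `KEY_SIZE` value are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;	/* stand-in for the kernel's u8 type */
#define KEY_SIZE 32	/* hypothetical size */

/* ASSUMPTION: the marker simply hides the `static` keyword so that the
 * parameter declaration reads more naturally. */
#define at_least static

/* Reads as "key has at least KEY_SIZE elements". */
static void fill_key(u8 key[at_least KEY_SIZE], u8 value)
{
	for (size_t i = 0; i < KEY_SIZE; i++)
		key[i] = value;
}

/* Hypothetical demo: fills a correctly sized buffer and checks both ends. */
static int demo_at_least(void)
{
	u8 key[KEY_SIZE] = { 0 };

	/* Passing a 16-byte array here would break the promise made by the
	 * prototype; a compiler may warn about such a call. */
	fill_key(key, 0xAB);
	assert(key[0] == 0xAB);
	return key[0] == 0xAB && key[KEY_SIZE - 1] == 0xAB;
}
```

The macro changes nothing about what the compiler checks; its only purpose in this sketch is to replace a confusing keyword with a name that states the intent.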

To editor: example using `at_least` needed A commenter asked for an example that uses the new `at_least` macro, since readers often treat LWN articles as instructional and bare uses of `static` would later need cleaning up. The reply was that `at_least` was a late addition and that the best example is the series of changes posted by Eric Biggers; another commenter suggested patch 3/3 of the original series, which shows the error messages a violation would produce.

Bugged gcc A commenter observed that Linus Torvalds’s commit removing the array-size warning indicates that several GCC versions produce false positives on various target architectures. That unreliability is why the warning was disabled in 6.8, and it means the check cannot be relied upon under GCC until the diagnostics improve. |