Trip Report: C++ Standards Meeting in Jacksonville, March 2018

Summary / TL;DR

Project | What’s in it? | Status
C++17 | See list | Published!
C++20 | See below | On track
Library Fundamentals TS v2 | source code information capture and various utilities | Published! Parts of it merged into C++17
Concepts TS | Constrained templates | Merged into C++20 with some modifications
Parallelism TS v2 | Task blocks, library vector types and algorithms, and more | Sent out for PDTS ballot
Transactional Memory TS | Transaction support | Published! Not headed towards C++20
Concurrency TS v1 | future.then(), latches and barriers, atomic smart pointers | Published! Parts of it merged into C++20, more on the way
Executors | Abstraction for where/how code runs in a concurrent context | Reached design consensus. Ship vehicle not decided yet.
Concurrency TS v2 | See below | Under development. Depends on Executors.
Networking TS | Sockets library based on Boost.ASIO | Publication imminent
Ranges TS | Range-based algorithms and views | Published!
Coroutines TS | Resumable functions, based on Microsoft’s await design | Published!
Modules TS | A component system to supersede the textual header file inclusion model | Voted for publication!
Numerics TS | Various numerical facilities | Under active development
Graphics TS | 2D drawing API | Under design review; some controversy
Reflection TS | Code introspection and (later) reification mechanisms | Initial working draft containing introspection proposal passed wording review
Contracts | Preconditions, postconditions, and assertions | Proposal under wording review, targeting C++20

A few links in this blog post may not resolve until the committee’s post-meeting mailing is published (expected within a few days of April 2, 2018). If you encounter such a link, please check back in a few days.

Introduction

A couple of weeks ago I attended a meeting of the ISO C++ Standards Committee (also known as WG21) in Jacksonville, Florida. This was the first committee meeting in 2018; you can find my reports on 2017’s meetings here (February 2017, Kona), here (July 2017, Toronto), and here (November 2017, Albuquerque). These reports, particularly the Albuquerque one, provide useful context for this post.

With the final C++17 International Standard (IS) having been officially published, this meeting was focused on C++20, and the various Technical Specifications (TS) we have in flight.

C++17

As mentioned, C++17 has been officially published, around the end of last year. The official published version can be purchased from ISO’s website; a draft whose technical content is identical is available free of charge here.

See here for a list of new language and library features in C++17.

The latest versions of GCC and Clang both have complete support for C++17, modulo bugs. MSVC has significant partial support, but full support is still a work in progress.

C++20

C++20 is under active development. A number of new changes have been voted into its Working Draft at this meeting, which I list here. For a list of changes voted in at previous meetings, see my Toronto and Albuquerque reports.

Technical Specifications

In addition to the C++ International Standard, the committee publishes Technical Specifications (TS) which can be thought of as experimental “feature branches”, where provisional specifications for new language or library features are published and the C++ community is invited to try them out and provide feedback before final standardization.

The committee recently published four TSes – Coroutines, Ranges, Networking, and most recently, Modules – and several more are in progress.

Modules TS

The last meeting ended with the Modules TS close to being ready for a publication vote, but not quite there yet, as the Core Working Group (CWG) was still in the process of reviewing resolutions to comments sent in by national standards bodies in response to the PDTS (“Proposed Draft TS”) ballot. Determined not to leave the resolution of the matter to this meeting, CWG met via teleconference on four different occasions in between meetings to finish the review process. Their efforts were successful; in particular, I believe that the issues that I described in my last report as causing serious implementer concerns (e.g. the “views of types” issue) have been resolved. The revised document was voted for publication a few weeks before this meeting (also by teleconference).

That allowed the time during this meeting to be spent discussing design issues that were explicitly deferred until after the TS’s publication. I summarize that technical discussion below.

Parallelism TS v2

The Parallelism TS v2 has picked up one last major feature: data-parallel vector types and operations, also referred to as “SIMD”. With that in place, the Parallelism TS v2 was sent out for its PDTS ballot.

Concurrency TS v2

The Concurrency TS v2 (no working draft yet) is continuing to take shape. There’s a helpful paper that summarizes its proposed contents and organization.

A notable component of the Concurrency TS v2 that I didn’t mention in my last report is a revised version of future::then() (the original version appeared in the Concurrency TS v1, but there was consensus against moving forward with it in that form). This, however, depends on Executors, which will be published independently of the Concurrency TS v2, either in C++20 or a TS of its own.

Library Fundamentals TS v3

The Library Fundamentals TS is a sort of grab-bag TS for library proposals that are not large enough to get their own TS (like Networking did), but experimental enough not to go directly into the IS. It’s now on its third iteration, with v1 and significant components of v2 having merged into the IS.

No new features have been voted into v3 yet, but an initial working draft has been prepared, basically by taking v2 and removing the parts of it that have merged into C++17 (including optional and string_view); the resulting draft will be open to accept new proposals at future meetings (I believe mdspan (a multi-dimensional array view) and expected<T> (similar to Rust’s Result<T>) are headed that way).

Reflection TS

After much anticipation, the Reflection TS is now an official project, with its initial working draft based on the latest version of the reflexpr static introspection proposal. I believe the extensions for static reflection of functions are targeting this TS as well.

It’s important to note that the Reflection TS is not the end of the road for reflection in C++; further improvements, including a value-based (as opposed to type-based) interface for reflection, and metaclasses, are being explored (I write more about these below).

Future Technical Specifications

There are some planned future Technical Specifications that don’t have an official project or working draft yet:

Graphics

The proposal for a Graphics TS, set to contain 2D graphics primitives with an interface inspired by cairo, continues to be under discussion in the Library Evolution Working Group (LEWG).

At this meeting, the proposal encountered some controversy. A library like this is unlikely to be used for high-performance production use cases like games and browsers; the target audience is more likely people teaching and learning C++, and authors of non-performance-intensive GUI applications. Some people consider that to be a poor use of committee time (it was observed that a large proposal like this would tie up the Library Working Group for one or two full meetings’ worth of wording review). On the other hand, the proposal’s authors have been “strung along” by the committee for a couple of years now, and have invested significant time into polishing the proposal to be standards-quality.

The committee plans to hold an evening session at the next meeting to decide the future of the proposal.

Executors

Executors are an important concurrency abstraction for which the committee has been trying to hash out a suitable design for a long time. There is finally consensus on a design (see the proposal and accompanying design paper), and the Concurrency Study Group had been planning to publish it in its own Technical Specification.

Meanwhile, it became apparent that several other proposals depend on executors, including Networking (which isn’t integrated with executors in its TS form, but people would like it to be prior to merging it into the IS), the planned improvements to future, and new execution policies for parallel algorithms. Coroutines doesn’t necessarily have a dependency, but there are still integration opportunities.

As a result, the Concurrency Study Group is eyeing the possibility of getting executors directly into C++20 (instead of going through a TS), to unblock dependent proposals sooner.

Merging Technical Specifications into C++20

After a TS has been published and has garnered enough implementation and use experience that the committee is confident enough to officially standardize its contents, it can be merged into the standard. This happened with e.g. the Filesystems and Parallelism TSes in C++17, and significant parts of the Concepts TS in C++20.

As the committee has a growing list of published-but-not-yet-merged TSes, there was naturally some discussion of which of these would be merged into C++20.

Coroutines TS

The Coroutines TS was proposed for merger into C++20 at this meeting. There was some pushback from adopters who tried it out and brought up several concerns (these concerns were subsequently responded to).

We had a lively discussion about this in the Evolution Working Group (EWG). I summarize the technical points below, but the procedural outcome was that those advocating for significant design changes will have until the next meeting to bring forward a concrete proposal for such changes, or else “forever hold their peace”.

Some felt that such a “deadline” is a bit heavy-handed, and I tend to agree with that. While there certainly needs to be a limit on how long we wait for hypothetical future proposals that improve on a design, the Coroutines TS was just published in November 2017; I don’t think it’s unreasonable to ask that implementers and users be given more than a few months to properly evaluate it and formulate high-quality proposals to improve it if appropriate.

Ranges TS

The Ranges TS modernizes and Conceptifies significant parts of the standard library (the parts related to algorithms and iterators).

Its merge into the IS is planned to happen in two parts: first, the foundational Concepts that a large spectrum of future library proposals may want to make use of, and then the range-based algorithms and utilities themselves. The purpose of the split is to allow the first part to merge into the C++20 working draft as soon as possible, thereby unblocking proposals that wish to use the foundational Concepts.

The first part is targeting C++20 pretty firmly; the second part is still somewhat up in the air, with technical concerns relating to what namespace the new algorithms will go into (there was previously talk of a std2 namespace to serve as a place to house new-and-improved standard library facilities, but that has since been scrapped) and how they will relate to the existing algorithms; however, the authors are still optimistic that the second half can make C++20 as well.

Networking TS

There is a lot of desire to merge the Networking TS into C++20, but the dependence on executors makes that timeline challenging. As a best case scenario, it’s possible that executors go into C++20 fairly soon, and there is time to subsequently merge the Networking TS into C++20 as well. However, that schedule can easily slip to C++23 if the standardization of executors runs into a delay, or if the Concurrency Study Group chooses to go the TS route with executors.

The remaining parts of the Concepts TS

The Concepts TS was merged into the C++20 working draft in Toronto, but without the controversial abbreviated function templates (AFTs) feature (and some related things).

I mentioned that there was still a lot of demand for AFTs, even if there was no consensus for them in their Concepts TS form, and that alternative AFT proposals targeting C++20 would be forthcoming. Several such proposals were brought forward at this meeting; I discuss them below. While there wasn’t final agreement on any of them at this meeting, there was consensus on a direction, and there is relative optimism about being able to get AFTs in some form into C++20.

What about Modules?

The Modules TS was just published a few weeks ago, so talk of merging it into the C++ IS is a bit premature. Nonetheless, it’s a feature that people really want, and soon, and so there was a lot of informal discussion about the possibility of such a merge.

There were numerous proposals for post-TS design changes to Modules brought forward at this meeting; I summarize the EWG discussion below. On the whole, I think the design discussions were quite productive. It certainly helped that the Modules TS is now published, and design concerns could no longer be postponed as “we’ll deal with this post-TS”.

I think it’s too early to speculate about the prospects of getting Modules into C++20, but there seems to be a potential path forward, which I describe below as well.

Evolution Working Group

I’ll now write in a bit more detail about the technical discussions that took place in the Evolution Working Group, the subgroup that I sat in for the duration of the week.

Unless otherwise indicated, proposals discussed here are targeting C++20. I’ve categorized them into the usual “accepted”, “further work encouraged”, and “rejected” categories:

Accepted proposals:

  • A couple of minor tweaks to the Coroutines TS: symmetric coroutine transfer, and parameter preview for coroutine promise constructor.
  • Clarifications about the behaviour of contract checks that modify observable (e.g. global) state. The outcome was that evaluating such a contract check constitutes undefined behaviour.
  • Class types in non-type template parameters. This is a long-desired feature, with an example use case being format strings checked at compile-time, and one of the few remaining gaps in the language where user-defined types don’t have all the powers of built-in types. The feature had been blocked on the issue of how to determine the equivalence of two non-type template parameters of class type (which is needed to be able to establish the equivalence of template specializations). Default comparisons finally provided a way forward here; class types used as non-type template parameters need to have a defaulted operator<=> (as do their members).
  • Static reflection of functions. This is an extension to the reflexpr proposal to allow reflecting over functions. You can’t reflect over an overload set; rather, reflexpr can accept a function call expression as an argument, perform overload resolution (without evaluating the call), and reflect the chosen overload. This is targeting the Reflection TS, not C++20.
  • Standard containers and constexpr. This proposal aims to allow the use of dynamic allocation in a constexpr context, so as to make e.g. std::vector usable by constexpr functions. This is accomplished by allowing destructors to be constexpr, and allowing new-expressions and std::allocator to be used in a constexpr context. (The latter is necessary because something like std::vector, which maintains a partially initialized dynamic allocation, can’t be implemented using new-expressions alone. operator new itself isn’t supported, because it loses information about the type of the allocated storage; std::allocator::allocate(), which preserves such information, needs to be used instead.) The proposal as currently formulated does not allow dynamic allocations to “survive” beyond constant expression evaluation; there will be a future extension to allow this, where “surviving” allocations will be promoted to static or automatic storage duration as appropriate.
  • char8_t: a type for UTF-8 characters and strings. This is a combined core language + library proposal; the language parts include introducing a new char8_t type, and changing the behaviour of u8 character and string literals to use that type. The latter changes are breaking, though the expected breakage is fairly slight, especially for u8 character literals which are new in C++17 and not heavily used yet.

    Discussion of this proposal centered around the big-picture plan of how UTF-8 adoption will work, and whether we can’t just work towards char itself implying a UTF-8 encoding. Several people argued that that’s unlikely to happen, due to large amounts of legacy code that don’t treat char as UTF-8, and due to the special role of char as an “aliasing” type (where an array of char is allowed to serve as the underlying storage for objects of other types) which prevents compilers from optimizing uses of char the way they could optimize char8_t (which, importantly, would be a non-aliasing type).

    In the end, EWG gave the green light to the direction outlined in the paper. (There was a brief discussion of pursuing this as a TS, but there was no consensus for this, in part because people felt that if we’re going to change the meaning of u8 literals, we might as well do it now before the C++17 meaning gets a lot of adoption.)
  • explicit(bool). This allows constructors to be declared as “conditionally explicit”, based on a compile-time condition. This is mostly useful for wrapper types like pair or optional, where we want their constructors to be explicit if and only if the constructors of their wrapped types are.
  • Checking for abstract class types. This tweaks the rules regarding when attempted use of an abstract type as a complete object is diagnosed, to avoid situations where a class definition retroactively makes a previously declared function that uses the type ill-formed.
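To make the explicit(bool) item above more concrete, here is a rough sketch of what a wrapper type has to do in C++17 to get a "conditionally explicit" constructor: write the constructor twice and use SFINAE to pick the implicit or the explicit version. The wrapper and strict types are hypothetical names invented for this illustration; they are not taken from the proposal or from any library.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical single-element wrapper, used only for illustration.
template <class T>
struct wrapper {
    T value;

    // Selected when U converts to T implicitly: this constructor is implicit too.
    template <class U,
              std::enable_if_t<std::is_constructible_v<T, U&&> &&
                               std::is_convertible_v<U&&, T>, int> = 0>
    constexpr wrapper(U&& u) : value(std::forward<U>(u)) {}

    // Selected when T can only be constructed from U explicitly.
    template <class U,
              std::enable_if_t<std::is_constructible_v<T, U&&> &&
                               !std::is_convertible_v<U&&, T>, int> = 0>
    constexpr explicit wrapper(U&& u) : value(std::forward<U>(u)) {}
};

// Hypothetical type whose constructor from int is explicit.
struct strict {
    explicit constexpr strict(int v) : n(v) {}
    int n;
};
```

With this in place, an int converts implicitly to wrapper<long> but only explicitly to wrapper<strict>. With explicit(bool), the two overloads would collapse into a single constructor declared explicit(!std::is_convertible_v<U&&, T>).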

There were also a few that, after being accepted by EWG, were reviewed by CWG and merged into the C++20 working draft the same week, and thus I already mentioned them in the C++20 section above:

Finally, EWG decided to pull the previously-approved proposal to allow string literals in non-type template parameters, because the more general facility to allow class types in non-type template parameters (which was just approved) is a good enough replacement. (This is a change from the last meeting, when it seemed like we would want both.) The main difference is that you now have to wrap your character array into a struct (think fixed_string or similar), and use that as your template parameter type. (The user-defined literal part of P0424 is still going forward, with a corresponding adjustment to the allowed template parameter types.)
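As a hedged sketch of the "wrap your character array into a struct" idea: the fixed_string below is a hypothetical type written for this illustration, not a standard library facility. The wrapper itself is plain C++17. The part that uses it as a template parameter is guarded by the standard feature-test macro __cpp_nontype_template_args, so it is only compiled where class-type template parameters are available. The sketch omits the defaulted operator<=> mentioned above, on the assumption that the compiler accepts a struct with only public members; whether that holds depends on the rules a given compiler implements.

```cpp
#include <cstddef>

// Hypothetical wrapper type: the name and members are illustrative only.
template <std::size_t N>
struct fixed_string {
    char data[N] = {};

    // Copies a string literal, including its terminating '\0'.
    constexpr fixed_string(const char (&s)[N]) {
        for (std::size_t i = 0; i < N; ++i) data[i] = s[i];
    }

    // Length without the terminator.
    constexpr std::size_t size() const { return N - 1; }
};

#if defined(__cpp_nontype_template_args) && __cpp_nontype_template_args >= 201911L
// Only compiled where class types are accepted as non-type template parameters.
template <fixed_string S>
constexpr std::size_t length_of() { return S.size(); }
#endif
```

Where the guarded part is enabled, length_of<"hello">() passes the string as a template argument, with the array size deduced from the literal.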


Proposals for which further work is encouraged:

  • C++ stability, velocity, and deployment plans. This is a proposal for a Standing Document (SD; a less-official-than-a-standard committee document, typically with procedural rather than technical content) outlining the procedure by which breaking changes can be made to C++. It classifies breaking changes by level of detectability (e.g. statically detectable and causes a compiler error, statically detectable but doesn’t cause a compiler error, not statically detectable), and issues guidance for whether and how changes in each category can be made. EWG encouraged the authors to come back with specific wording for the proposed SD.
  • Standard library compatibility promises. This is another proposal for a Standing Document, outlining what compatibility promises the C++ standard library makes to its users, and what kind of future changes it reserves to make. (As an example, the committee reserves the right to add new overloads to standard library functions. This may break user code that tries to take the address of a standard library function, and we want to make it clear that such breakage is par for the course; if you want a guarantee that your code will compile without modifications in future standards, you can only call standard library functions, not take their address.)
  • LEWG wishlist for EWG. This is a wishlist of core language issues that the Library Evolution Working Group would like to see addressed to solve problems facing library authors and users. Some of the items included reining in overeager ADL (see below for a proposal to do just that), making it easier to avoid lifetime errors, dealing with ABI breakage, and finding alternatives for the remaining use cases of macros. EWG encouraged future proposals in these areas, or discussion papers that advance our understanding of the problem (for example, a survey of macro use cases that don’t have non-macro alternatives).
  • Extending the offsetof macro to allow computing the offset to a member given a pointer-to-member variable (currently it requires being given the member’s name). EWG thought this was a valid use case, but expressed a preference for a different syntax rather than overloading the offsetof macro.
  • Various proposed extensions to the Modules TS, which I talk about below.
  • Towards consistency between <=> and other comparison operators. The background to this proposal is that when the <=> operator was introduced, there were a few cases where the specified behaviour was a departure from the corresponding behaviour for the existing two-way comparison operators. These were cases where we would have liked to change the behaviour for the existing operators, but couldn’t due to backwards compatibility considerations. <=>, however, being new to the language, had no such backwards compatibility considerations, so the authors specified the more-desirable behaviour for it. The downside is that this introduced inconsistencies between <=> and the two-way comparison operators.

    This proposal aims to resolve those inconsistencies, in some cases by changing the behaviour of the two-way operators after all. There were five specific areas of change:

    • Sign safety. Today, -1 < 1u evaluates to false due to sign conversion, which is not the mathematically correct result. -1 <=> 1u, on the other hand, is a compiler error. EWG decided that both should in fact work and give the mathematically correct result (which for -1 < 1u is a breaking change, though in practice it’s likely to fix many more bugs than it introduces), though whether this will happen in C++20, or after a longer transition period, remains to be decided.
    • Enum safety. Today, C++ allows two-way comparisons between enumerators of distinct enumerator types, and between enumerators and floating-point values. Such comparisons with <=> are ill-formed. EWG felt they should be made ill-formed for two-way comparisons as well, though again this may happen by first deprecating them in C++20, and only actually making them ill-formed in a future standard. (Comparisons between enumerators and integer values are common and useful, and will be permitted for all comparison operators.)
    • Array safety. Two-way comparisons between operands of array type will be deprecated.
    • Null safety. This is just a tweak to make <=> between a pointer and nullptr return strong_equality rather than strong_ordering.
    • Function pointer safety. EWG expressed a preference for allowing all comparisons between function pointers, and requiring implementers to impose a total order on them. Some implementers indicated they need to investigate the implementability of this on some architectures and report back.
  • Chaining comparisons. This proposes making chains of comparisons, such as a == b == c or a < b <= c, have their expected mathematical meaning (which is currently expressed in C++ in a more cumbersome way, e.g. a == b && b == c). This is a breaking change, since such expressions currently have a meaning (evaluate the first comparison, use its boolean result as the value for the second comparison, and so on). It’s been proposed before, but EWG was worried about the silent breaking change. Now, the authors have surveyed a large body of open-source code, and found zero instances of such expressions where the intended meaning was the current meaning, but several instances where the intended meaning was the proposed meaning (and which would therefore be silently fixed by this proposal). Importantly, comparison chains are only allowed if the comparisons in the chain are either all ==, all < and <=, or all > and >=; other chains like a < b > c are not allowed, unlike in Python, for example. In the original proposal, such “disallowed” chains would have retained their current meaning, but EWG asked that they be made ill-formed instead, to avoid confusion. The proposal also contained a provision to have folds over comparisons (e.g. a < ..., where a is a function parameter pack) expand to a chained comparison, but EWG chose to defer that part of the proposal until more implementation experience can be gathered.
  • Size feedback in operator new. This proposes overloads of operator new that return how much memory was allocated (which may be more than what was asked for), so the caller can make use of the entire allocation. EWG agreed with the use case, but had some concerns about the explosion of operator new overloads (each new variation that’s added doubles the number of overloads; with this proposal, it would be 8), and the complications around having the new overloads return a structure rather than void*, and asked the authors to come back after exploring the design space a bit more.
  • The assume_aligned attribute. The motivation is to allow authors to signal to the compiler that a variable holds a value with a particular alignment at a given point in time, for purposes such as more efficient vectorization. The alignment is a property of the variable’s value at a point in time, not of the variable itself (e.g. you can subsequently increment the pointer and it will no longer have that alignment). EWG liked the idea but felt that the proposed semantics about where the attribute could apply (for example, that it could apply to parameter variables but not local variables) were confusing. Suggested alternatives included a magic library function (which would more clearly apply at the time it’s called), and something you can place into a contract check.
  • Fixing ADL. This is a resurrection of a proposal that’s more than a decade old, to fix argument-dependent lookup (ADL). ADL often irks people because it’s too eager, and often finds overloads in other namespaces that you didn’t intend. This proposal to fix it was originally brought forward in 2005, but was deferred at the time because the committee was behind in shipping C++0x (which became C++11); it finally came back now. It aims to make two changes to ADL:
    • Narrow the rules for what makes a namespace an associated namespace for the purpose of ADL. The current rules are very broad; in particular, they include not only the namespaces of the arguments of a function call, but also the namespaces of the template parameters of the arguments, which is responsible for a lot of unintended matches. The proposal would axe the template parameters rule.
    • Even if a function is found in an associated namespace, only consider it a match if it has a parameter matching the argument that caused the namespace to be associated, in the relevant position.

    This is a scary change, because it has the potential to break a lot of code. EWG’s main feedback was that the authors should try implementing it, and test some large codebases to understand the scope of breakage. There were also some concerns about how the second change would interact with Concepts (and constrained templates in general). The proposal will come back for further review.

  • A proposed language-level mitigation for Spectre variant 1, which I talk about below.
  • Allow initializing aggregates from a parenthesized list of values. This aims to solve a long-standing issue where e.g. vector::emplace() didn’t work with aggregate types, because the implementation of emplace() would do new T(args...), while aggregates required new T{args...}. A library solution was previously proposed for this, but the library groups were unhappy with it because it felt like a workaround for a language deficiency, and it would have had to be applied everywhere in the library where it was a problem (with vector::emplace() being just one example). This proposal fixes the deficiency at the language level. EWG generally liked the idea, though there was also a suggestion that a related problem with aggregate initialization (deleted constructors not preventing it) be solved at the same time. There was also a suggestion that the proposal only apply in dependent contexts (since in non-dependent contexts, you know what kind of initialization you need to use), but that was shot down.
  • Signed integers are two’s complement. The standard currently allows various representations for signed integers, but two’s complement is the only one used in practice, on all modern architectures; this proposal aims to standardize on that, allowing code to portably rely on the representation (and e.g. benefit from hardware capabilities like an arithmetic right shift). EWG was supportive of the idea, but expressed a preference for touching base with WG14 (the C standards committee) to make sure they’re on board with this change. (The original version of this proposal would also have defined the overflow behavior for signed integers as wrapping; this part was rejected in other subgroups and never made it to EWG.)
  • Not a proposal, but the Core Working Group asked EWG whether non-template functions should be allowed to be constrained (with a requires-clause). There are some use cases for this, such as having multiple implementations of a function conditioned on some compile-time condition (e.g. platform, architecture, etc.). However, this would entail some specification work, as the current rules governing overloading of constrained functions assume they are templates, and don’t easily carry over to non-templates. EWG opted not to allow them until someone writes a paper giving sufficient motivation.
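The sign-safety and chained-comparison items in the list above are easier to appreciate with today's behaviour spelled out. The following is a small sketch in C++17; all function names are hypothetical and exist only for this illustration. It shows what the built-in operators currently do next to a hand-written version of the mathematically expected result.

```cpp
// Today's built-in rule: the int operand is converted to unsigned first,
// so -1 becomes a very large value and the comparison yields false.
constexpr bool builtin_less(int a, unsigned b) { return a < b; }

// Hand-written comparison that gives the mathematical answer.
constexpr bool math_less(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}

// Today, a < b < c groups as (a < b) < c: the bool result of the first
// comparison (0 or 1) is then compared against c.
constexpr bool chained_today(int a, int b, int c) { return (a < b) < c; }

// What most readers expect the chain to mean.
constexpr bool chained_intended(int a, int b, int c) { return a < b && b < c; }
```

For example, builtin_less(-1, 1u) is false while math_less(-1, 1u) is true, and chained_today(3, 2, 1) is true even though 3, 2, 1 is not an ascending sequence.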
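For the ADL item in the list above, here is a minimal sketch of the kind of match that the current "template parameters" rule produces. The namespaces lib and app and everything in them are hypothetical names made up for this example.

```cpp
namespace lib {
    struct tag {};

    // A generic function that happens to live next to `tag`.
    template <class T>
    constexpr int describe(const T&) { return 1; }
}

namespace app {
    // This class template has nothing to do with namespace lib...
    template <class T>
    struct box {};

    constexpr int call_unqualified() {
        // ...yet lib is an associated namespace of box<lib::tag>, because
        // lib::tag appears as a template argument, so ADL finds lib::describe.
        return describe(box<lib::tag>{});
    }
}
```

This compiles today. As I read the first proposed change, lib would no longer be an associated namespace for this call, so the unqualified call would stop finding lib::describe.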

Rejected proposals:

  • Supporting offsetof for all classes. offsetof is currently only guaranteed to work for standard-layout classes, but there are some use cases for it related to memory-mapped IO, serialization, and similar low-level things, that require it to work for some classes that aren’t standard-layout. EWG reiterated the feedback it gave on the previous proposal on this topic: to expand the definition of standard-layout to include the desired types. EWG was disinclined to allow offsetof for all classes, including ones with virtual bases, as proposed in this paper; it was felt that this more general goal could be accomplished with a future reflection-based facility.
  • Structured bindings with polymorphic lambdas. This would have allowed a structured binding declaration (e.g. auto [a, b]) as a function parameter, with the semantics that it binds to a single argument (the composite object), and is decomposed into the named constituents on the callee side. EWG sympathized with the goal, but had a number of concerns including visual ambiguity with array declarators, and encouraging the use of templates (and particularly under-constrained templates, until structured bindings are extended to allow a concept in place of auto) where otherwise you might use a non-template.
  • Structured binding declaration as a condition. This would have allowed a condition like if (auto [a, b] = f()), where the condition evaluates to the composite object returned by f() (assuming that object is already usable as a condition, e.g. by having a conversion operator to bool). EWG felt that the semantics weren’t obvious (in particular, people might think one of the decomposed variables is used as the condition). There were also unanswered questions like, in the case of a composite object that uses get<>() calls to access the decomposed variables, whether those calls happen before or after the call to the conversion operator. It was pointed out that you can already use a structured binding in a condition if you use the “if with initializer” form added in C++17, e.g. if (auto [result, ok] = f(); ok), and this is preferable because it makes clear what the condition is. (Some people even expressed a desire for deprecating the declaration-as-condition form altogether, although there was also opposition to that.)
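The “if with initializer” alternative mentioned in the last item above already works in C++17. Here is a small self-contained sketch; parse_result, parse_digit, and digit_or are hypothetical names invented for this example.

```cpp
// Hypothetical result type and parser, for illustration only.
struct parse_result {
    int value;
    bool ok;
};

constexpr parse_result parse_digit(char c) {
    if (c >= '0' && c <= '9') return {c - '0', true};
    return {0, false};
}

constexpr int digit_or(char c, int fallback) {
    // The initializer declares the bindings; the condition names `ok`
    // explicitly, so there is no doubt about what is being tested.
    if (auto [value, ok] = parse_digit(c); ok) {
        return value;
    }
    return fallback;
}
```

Here digit_or('7', -1) yields 7 and digit_or('x', -1) yields the fallback.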

Spectre

No significant meeting of software engineers in the past few months has gone without discussion of Spectre, and this standards meeting was no exception.

Google brought forward a proposal for a language-level mitigation for variant #1 of Spectre (which, unlike variant #2, has no currently known hardware-level mitigation). The proposal allows programmers to harden specific branches against speculation, like so:


  if [[protect_from_speculation(args...)]] (predicate) {
    // use args
  }

args... here is a comma-separated list of one or more variables that are in scope. The semantics is that, if predicate is false, any speculative execution inside the if block treats each of the args as zero. This protects against the exploit, which involves using side channels to recover information accessed inside (misspeculated execution of) the branch at a location that depends on args.

The described semantics can be implemented in assembly; see this llvm-dev post for a description of the implementation approach.
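
For readers who haven’t looked at the exploit closely, the following sketch shows the general shape of code affected by variant #1, together with a source-level model of the “treat args as zero when the predicate is false” semantics. All names here (table, kTableSize, read_unhardened, read_masked) are hypothetical. Note that read_masked is only a model of the intended behaviour, not a working mitigation: an optimizer is free to notice that the mask is all-ones inside the branch and fold it away, which is exactly why the real implementation has to be done at the instruction level.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical lookup table; the names are made up for illustration.
constexpr std::size_t kTableSize = 16;
std::uint8_t table[kTableSize] = {};

// Typical shape of a variant-1 gadget: the bounds check is correct as far
// as the abstract machine is concerned, but a mispredicted branch may
// speculatively read table[i] with an out-of-bounds i.
std::uint8_t read_unhardened(std::size_t i) {
    if (i < kTableSize) {
        return table[i];
    }
    return 0;
}

// Source-level model of the proposed semantics: derive an all-ones or
// all-zeros mask from the predicate and apply it to the index, so that
// whenever the predicate is false, the index used in the block is zero.
// (Assumption: this only models the semantics; a compiler may optimize
// the mask away, so it is not a reliable hardening by itself.)
std::uint8_t read_masked(std::size_t i) {
    const std::size_t mask =
        std::size_t{0} - static_cast<std::size_t>(i < kTableSize);
    if (i < kTableSize) {
        return table[i & mask];
    }
    return 0;
}
```

Architecturally the two functions return the same results for every input; the difference the proposal cares about exists only during misspeculated execution.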

For performance reasons, the proposed hardening is opt-in (as opposed to “harden all branches this way”, although compilers can certainly offer that as an option for non-performance-critical programs), and only as aggressive as it needs to be (as opposed to “disable speculation entirely for this branch”).

The language-level syntax to opt a branch into the hardening remains to be nailed down; the attribute syntax depicted above is one possibility. One complication is that if statements are not the only language constructs that compile down to branches; there are others, including some subtler ones like virtual function dispatch. The chosen syntax should be flexible enough to allow hardening all relevant constructs.

In terms of standardizing this feature, one roadblock is that the C++ standard defines the behavior of programs in terms of an abstract machine, and the semantics of the proposed hardening concern lower-level notions that cannot be described in such terms. As the committee is unlikely to reinvent the C++ abstract machine to allow reasoning about such things as speculative execution in normative wording, it may end up being the case that the syntax of the language construct is described normatively, while its semantics is described non-normatively.

This proposal will return to EWG in a more concrete form at the next meeting. As portably mitigating Spectre is a rather urgent desire in the C++ community, there was some talk of somehow standardizing this feature “out of band” rather than waiting for C++20, though it wasn’t clear what that might look like.

Concepts

EWG had an evening session to discuss proposals related to Concepts, particularly abbreviated function templates (AFTs).

To recap, AFTs are function templates declared without a template parameter list, with concept names used instead of type names in the signature. An example is void sort(Sortable& s);, which is a shorthand for template <Sortable __S> void sort(__S& s);. Such use of a concept name in place of a type name is called a constrained-type-specifier. In addition to parameter types, the Concepts TS allowed constrained-type-specifiers in return types (where the meaning was “the function’s return type is deduced, but also has to model this concept”), and in variable declarations (where the meaning was “the variable’s type is deduced, as if declared with auto, but also has to model this concept”).

constrained-type-specifiers did not make it into C++20 when the rest of the Concepts TS was merged, mostly because there were concerns that you can’t tell apart an AFT from a non-template function without knowing whether the identifiers that appear in the parameter list name types or concepts.

Four proposals were presented at this evening session, which aimed to get AFTs and/or other forms of constrained-type-specifiers into C++20 in some form.

I’ll also mention that the use of a concept name inside a template parameter list, such as template <Sortable S> (which is itself a shorthand for template <typename S> requires Sortable<S>), is called a constrained-parameter. constrained-parameters have been merged into the C++20 working draft, but some of the proposals wanted to make modifications to them as well, for consistency.

Three of the discussed proposals took the approach of inventing a new syntax for constrained-type-specifiers (and in some cases constrained-parameters) that wasn’t just an identifier, thus syntactically distinguishing AFTs from non-template functions.

  • Concept-constrained auto proposed the syntax auto<Sortable>. The proposal as written concerned variable declarations only, but one could envision extending this to other uses of constrained-type-specifiers.
  • An adjective syntax for concepts proposed Sortable typename S as an alternative syntax for constrained-parameters, with a possible future extension of Sortable auto x for constrained-type-specifiers. The idea is that the concept name is tacked, like an adjective, onto the beginning of what you’d write without concepts.
  • Concepts in-place syntax proposed Sortable{S} for constrained-parameters, and Sortable{S} s for constrained-type-specifiers (where S would be an additional identifier the declaration introduces, that names the concrete type deduced for the parameter/variable). You could also write Sortable{} s if you didn’t want/need to name the type. One explicit design goal of this proposal was that if, in the future, the committee changes its mind about AFTs needing to be syntactically distinguishable from non-template functions (because we get more comfortable with them, or are happy to rely more on tooling to tell them apart), the empty braces could be dropped altogether, and we’d arrive precisely at the Concepts TS syntax.

An additional idea that was floated, though it didn’t have a paper, was to just use the Concepts TS syntax, but add a single syntactic marker, such as a bare template keyword before the function declaration (as opposed to per-parameter syntactic markers, as in the above proposals).

Of these ideas, Sortable{S} had the strongest support, with “Concepts TS syntax + single syntactic marker” coming a close second. The proponents of these ideas indicated that they will try to collaborate on a revised proposal that can hopefully gain consensus among the entire group.

The fourth paper that was discussed attacked the problem from a different angle: it proposed adopting AFTs into C++20 without any special syntactic marker, but also changing the way name lookup works inside them, to more closely resemble the way name lookup works inside non-template functions. The idea was that, perhaps if the semantics of AFTs are made more similar to non-template functions (name lookup is one of the most prominent semantic differences between template and non-template code), then we don’t need to syntactically distinguish them. The proponents of having a syntactic marker did not find this a convincing argument for adopting AFTs without one, but it was observed that the proposed name lookup change might be interesting to explore independently. At the same time, others pointed out similarities between the proposed name lookup rules and C++0x concepts, and warned that going down this road would lead to C++0x lookup rules (which were found to be unworkable).

(As an aside, one topic that seems to have been settled without much discussion was the question of independent resolution vs. consistent resolution; that is, if you have two uses of the same concept in an AFT (as in void foo(Number, Number);), are they required to be the same concrete type (“consistent”), or two potentially different types that both model the concept (“independent”). The Concepts TS has consistent resolution, but many people prefer independent resolution. I co-authored a paper arguing for independent resolution a while back; that sentiment was subsequently reinforced by another paper, and also in a section of the Sortable{S} proposal. Somewhat to my amusement, the topic was never actually formally discussed and voted on; the idea of independent resolution just seemed to slowly, over time, win people over, such that by this meeting, it was kind of treated as a done deal, that any AFT proposal going into C++20 will, in fact, have independent resolution.)
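
The difference between the two resolution models can be sketched with ordinary, unconstrained templates, which is roughly what each model would desugar void foo(Number, Number); into if you leave the Number constraint aside. The function names below are hypothetical.

```cpp
#include <type_traits>

// "Consistent" resolution: both parameters share a single template
// parameter, so both arguments must deduce to the same type.
template <typename T>
constexpr bool foo_consistent(T, T) {
    return true;
}

// "Independent" resolution: each parameter gets its own template
// parameter, so the two argument types may differ. The return value
// reports whether they happened to be the same type.
template <typename T, typename U>
constexpr bool foo_independent(T, U) {
    return std::is_same<T, U>::value;
}

// foo_consistent(1, 2.5) would not compile: T is deduced as both int and
// double. foo_independent(1, 2.5) is accepted, with T = int, U = double.
static_assert(foo_consistent(1, 2), "same types are fine in both models");
static_assert(!foo_independent(1, 2.5), "different types are accepted");
```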

Coroutines

As mentioned above, EWG had a discussion about merging the Coroutines TS into C++20.

The main pushback was due to a set of concerns described in this paper (see also this response paper). The concerns fell into three broad categories:

  • Performance concerns. As currently specified, coroutines perform a dynamic allocation to store the state that needs to be saved in between suspensions. The dynamic allocation can be optimized away in many cases, but it was argued that for some use cases, you want to avoid the dynamic allocation by construction, without relying on your optimizer. An analogy can be made to std::vector: sure, compilers can sometimes optimize the dynamic allocation it performs to be a stack allocation, but we still have stack arrays in the language to guarantee stack allocation.

    One particularly interesting use case that motivates this performance guarantee, is using coroutines to implement a form of error handling similar to Rust’s try! macro / ? operator. The general idea is to hook the coroutine customization points for a type like expected<T> (the proposed C++ analogue of Rust’s Result), such that co_await e where e has type expected<T> functions like try!(e) would in Rust (see the paper for details). However, no one would contemplate using such an error handling mechanism if it didn’t come with a guarantee of not introducing a dynamic allocation.
  • Safety concerns. The issue here is that reference parameters to a coroutine may become dangling after the coroutine is suspended and resumed. There is a desire to change the syntax of coroutines to make this hazard more obvious.
  • Syntax concerns. There are several minor syntactic concerns related to the choice of keywords (co_await, co_yield, and co_return), having to use co_return instead of plain return, and the precedence of the co_await operator. There is a suggestion to address these by replacing co_await with a punctuation-based syntax, with both prefix and postfix forms for better composition (compare having both * and -> operators for pointer dereferencing).
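
To illustrate the error handling use case mentioned under the performance concerns, here is a sketch of the boilerplate that a co_await-based approach is meant to eliminate. Expected below is a hypothetical, minimal stand-in written for this example; it is not the interface of the proposed expected<T>, and parse_digit and add_digits are likewise made up.

```cpp
#include <string>

// Hypothetical, minimal stand-in for the proposed expected<T>. It is not
// the proposed API, just enough to show the control flow.
template <typename T>
struct Expected {
    bool ok;
    T value;            // meaningful only if ok is true
    std::string error;  // meaningful only if ok is false
};

Expected<int> parse_digit(char c) {
    if (c < '0' || c > '9') return {false, 0, "not a digit"};
    return {true, c - '0', ""};
}

// Manual propagation: every fallible call needs an explicit
// "check and return early" step. With the coroutine customization points
// hooked for the result type, each such step would collapse into
// something like `int a = co_await parse_digit(x);`.
Expected<int> add_digits(char x, char y) {
    Expected<int> a = parse_digit(x);
    if (!a.ok) return a;
    Expected<int> b = parse_digit(y);
    if (!b.ok) return b;
    return {true, a.value + b.value, ""};
}
```

The hand-written version involves no coroutine frame at all, which is the bar a coroutine-based replacement would have to meet for people to consider it.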

The paper authors plan to bring forward a set of modifications to the Coroutines TS that address these concerns. I believe the general idea is to change the syntax in such a way that you can explicitly access / name the object storing the coroutine state. You can then control whether it’s allocated on the stack or the heap, depending on your use case (e.g. passing it across a translation unit boundary would require allocating it on the heap, similar to other compiler-generated objects like lambdas).

EWG expressed interest in seeing the proposed improvements, while also expressing a strong preference for keeping coroutines on track to be merged into C++20.

Modules

EWG spent an entire day on Modules. With the Modules TS done, the focus was on post-TS (“Modules v2”) proposals.

  • Changing the term “module interface”. This paper argued that “module interface” was a misnomer because a module interface unit can contain declarations which are not exported, and therefore not conceptually part of the module’s interface. No functional change was proposed. EWG’s reaction was “don’t care”.
  • Modules: dependent ADL. The current name lookup rules in the Modules TS have the consequence that argument-dependent lookup can find some non-exported functions that are declared in a module interface unit. This proposal argued this was surprising, and suggested tightening the rules. EWG was favourable, and asked the author to come back with a specific proposal.
  • Modules: context-sensitive keyword. This proposed making module a context-sensitive keyword rather than a hard keyword, to avoid breaking existing code that uses module as an identifier. The general approach was that if a use of module could legally be a module declaration, it is, otherwise it’s an identifier. EWG disliked this direction, because the necessary disambiguation rules were too confusing (e.g. two declarations that were only subtly different could differ in whether module was interpreted as a keyword or an identifier). It was suggested that instead an “escape mechanism” be introduced for identifiers, where you could “decorate” an identifier as something like __identifier(module) or @module to keep it an identifier. It was also pointed out that adopting relevant parts of the “Another take on modules” proposal (see below) would make this problem moot by restricting the location of module declarations to a file’s “preamble”.
  • Unqualified using declarations. This proposed allowing export using name;, where name is unqualified, as a means of exporting an existing name (such as a name from an included legacy header). EWG encouraged exploration of a mechanism for exporting existing names, but wasn’t sure this would be the right mechanism.
  • Identifying module source code. This requires that any module unit either start with a module declaration, or with module; (which “announces” that this is a module unit, with a module declaration to follow). The latter form is necessary in cases where the module wants to include legacy headers, which usually can’t be included in the module’s purview. This direction was previously approved by EWG, and this presentation was just a rubber-stamp.
  • Improvement suggestions to the Modules TS. This paper made several minor improvement suggestions.
    • Determining whether an importing translation unit sees an exported type as complete or incomplete, based on whether it was complete or incomplete at the end of the module interface unit, rather than at the point of export. This was approved.
    • Exporting the declaration of an inline function should not implicitly export the definition as well. There was no consensus for this change.
    • Allow exporting declarations that don’t introduce names; an example is a static_assert declaration. Exporting such a declaration has no effect; the motivation here is to allow enclosing a group of declarations in export { ... }, without having to take care to move such declarations out of the block. This was approved for static_assert only; EWG felt that for certain other declarations that don’t introduce names, such as using-directives, allowing them to be exported might be misleading.
    • A tweak to the treatment of private members of exported types. Rejected because private members can be accessed via reflection.

That brings us to what I view as the most significant Modules-related proposal we discussed: Another take on modules (or “Atom” for short). This is a proposal from Google based on their deployment experience with Clang’s implementation of Modules; it’s a successor to previous proposals like this one. It aims to make several changes – some major, some minor – to the Modules TS; I won’t go through all of them here, but they include changes to name lookup and visibility rules, support for module partitions, and introducing the notion of a “module preamble”, a section at the top of a module file that must contain all module and import declarations. The most significant change, however, is support for modularized legacy headers. Modularized legacy headers are legacy (non-modular) headers included in a module, not via #include as in the Modules TS, but via import (as in import "file" or import <file>). The semantics is that, instead of textually including the header contents as you would with an #include, you process them as an isolated translation unit, produce a module interface artefact as-if it was a module (with all declarations exported, I assume), and then process the import as if it were an actual module import.

Modularized legacy headers are primarily a transition mechanism for incrementally modularizing a codebase. The proposal authors claim that without them, you can’t benefit from compile-time improvements of Modules in a codebase (and in fact, you can take a compile time hit!) unless you bottom-up modularize the entire codebase (down to the standard library and runtime library headers), which is viewed as infeasible for many large production codebases.

Importantly, modularized legacy headers also offer a way forward in the impasse about whether Modules should support exporting macros. In the Atom proposal, modularized legacy headers do export the macros they define, but real modules do not. (There is an independent proposal to allow real modules to selectively export specific macros, but for transition purposes, that’s not critical, since for components that have macros as part of their interface, you can just use them as a modularized legacy header.)

There was some discussion of whether the Atom proposal is different enough from the Modules TS that it would make sense to pursue it as a separate (competing) TS, or if we should try to integrate the proposed changes into the Modules TS itself. The second approach had the stronger consensus, and the authors plan to come back with a specific proposed diff against the Modules TS.

It’s too early to speculate about the impact of pursuing these changes on the schedule for shipping Modules (such as whether it can be merged into C++20). However, one possible shipping strategy might be as follows (disclaimer: this is my understanding of a potential plan based on private conversation, not a plan that was approved by or even presented to EWG):

  • Modules v1 is the currently shipping Modules TS. It is not forward-compatible with v2 or v3.
  • Modules v2 would be a modified version of v1 that would not yet support modularized legacy headers, but would be forward-compatible with v3. Targeting C++20.
  • Modules v3 would support modularized legacy headers. Targeting post-C++20, possibly a second iteration of the Modules TS.

Such a way forward, if it becomes a reality, would seem to satisfy the concerns of many stakeholders. We would ship something in the C++20 IS, and people who are able to bottom-up modularize their codebases can start doing so, without fear of further breaking changes to Modules. Others who need the power of modularized legacy headers can wait until Modules v3 to get it.

I’m pretty happy with the progress made on Modules at this meeting. With the Atom proposal having been discussed and positively received, I’m more optimistic about the feature than I have been for the past few meetings!

Papers not discussed

With the meeting being fairly heavily focused on large proposals like Concepts, Modules, and Coroutines, there were a number of others that EWG didn’t get a chance to look at. I won’t list them all (see the pre-meeting mailing for a list), but I’ll call out two of them: feature-test macros are finally on the formal standards track, and there’s a revised attempt to tackle named arguments in C++ that’s sufficiently different from previous attempts that I think it at least might not be rejected out of hand. I look forward to having these, and the other proposals on the backlog, discussed at the next meeting.

Other Working Groups

Library Groups

Having sat in EWG all week, I can’t report on technical discussions of library proposals, but I’ll mention where various proposals are in the processing queue.

I’ve already listed the library proposals that passed wording review and were voted into the C++20 working draft above.

A few proposals targeting Technical Specifications also passed wording review and were merged into the relevant TS working drafts:

The following proposals are still undergoing wording review:

The following proposals have passed design review and await wording review at future meetings:

The following proposals are still undergoing design review:

In addition, there is a fairly long queue of library proposals that haven’t started design review yet. See the committee’s website for a full list of proposals.

Finally, I’ll mention that the Library Evolution Working Group had a joint evening session with SG 14 (Low Latency Programming) to discuss possible new standard library containers in C++20. Candidates included a fixed capacity vector, a vector with a small object optimization, ring buffer, colony, and slot map; the first three had the greatest support.

Study Groups

SG 6 (Numerics)

SG 6 met for a day, and reviewed a number of numerics-related proposals. In addition to the “signed integers are two’s complement” proposal that later came to EWG, it looked at several library proposals. Math constants, constexpr for <cmath> and <cstdlib>, letting strong_order truly be a customization point, and interpolation were forwarded to LEWG (in some cases with modifications). More better operators and floating point value access for std::ratio remain under discussion. Safe integral comparisons have been made moot by operator<=> (the proposal was “abducted by spaceship”).
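
For context on the last item, the problem that safe integral comparisons address is that a built-in comparison between a signed and an unsigned integer first converts the signed operand to unsigned, so that -1 < 1u evaluates to false. Below is a sketch of a comparison by mathematical value, assuming C++17; safe_less is a hypothetical name chosen for this example and is not the interface from the paper.

```cpp
#include <type_traits>

// Hypothetical helper (the name is made up here): compares two integers
// by mathematical value, regardless of their signedness.
template <typename A, typename B>
constexpr bool safe_less(A a, B b) {
    static_assert(std::is_integral<A>::value && std::is_integral<B>::value,
                  "safe_less expects integer arguments");
    if constexpr (std::is_signed<A>::value == std::is_signed<B>::value) {
        // Same signedness: the built-in comparison is already correct.
        return a < b;
    } else if constexpr (std::is_signed<A>::value) {
        // a is signed, b is unsigned: a negative a is less than any b.
        return a < 0 || static_cast<std::make_unsigned_t<A>>(a) < b;
    } else {
        // a is unsigned, b is signed: nothing is less than a negative b.
        return b >= 0 && a < static_cast<std::make_unsigned_t<B>>(b);
    }
}
```

With this helper, safe_less(-1, 1u) is true, matching the mathematical ordering.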

SG 7 (Compile-Time Programming)

SG 7, the Compile-Time Programming (previously Reflection) Study Group, met for an evening session and reviewed three papers.

The first, called constexpr reflexpr, was an exploration of what the reflexpr static introspection proposal might look like formulated in terms of value-based constexpr programming, rather than template metaprogramming. SG 7 previously indicated that this is the direction they would like reflection proposals to take in the longer term. The paper was reviewed favourably, with encouragement to do further work in this direction. One change that was requested was to make the API value-based rather than pointer-based. Some implementers pointed out that unreflexpr, the operator that takes a meta-object and reifies it into the entity it represents, may need to be split into multiple operators for parsing purposes (since the compiler needs to know at parsing time whether the reified entity is a value, a type, or a template, but the meta-object passed as argument may be dependent in a template context). Finally, some felt that the constexpr for facility proposed in the paper (which bears some resemblance to the previously-proposed tuple-based for loop) may be worth pursuing independently.

The second was a discussion paper called “What do we want to do with reflection?” It outlines several basic / frequently requested reflection use cases, and calls for facilities that address these use cases to be added to C++20. SG 7 observed that one such facility, source code information capture, is already shipping in the Library Fundamentals TS v2, and could plausibly be merged into C++20, but for the rest, a Reflection TS published in the 2019-2020 timeframe is probably the best we can do.

The third was an updated version of the metaclasses proposal. To recap, metaclasses are compile-time transformations that can be applied to a class definition, producing a transformed class (and possibly other things like helper classes / functions). At the last meeting, SG 7 discussed how a metaclass should be defined, and decided on it operating at the “value level” (where the input and output types are represented as meta-objects, and the metaclass itself is more or less just a constexpr function). At this meeting, SG 7 focused on the invocation syntax: how you apply a metaclass to your class. The syntax that appeared to have the greatest consensus was class<interface> Foo { ... }; (where interface is an example metaclass name).

SG 15 (Tooling)

This week was the inaugural meeting of the new Tooling Study Group (SG 15), also in an evening session.

Unsurprisingly, the meeting was well attended, and the people there had many, many different ideas for how C++ tooling could be improved, ranging from IDEs, through refactoring and code analysis tools, to build systems and package managers. Much of the meeting was spent trawling through this large idea space to try to narrow down and focus the group’s scope and mission.

One topic of discussion was, what is the best representation of code for tools to consume? Some argued that the source code itself is the only sufficiently general and powerful representation, while others were of the opinion that a more structured, easy-to-consume representation would be useful, e.g. because it would avoid every tool that consumes it being (or containing / invoking) a C++ parser. It was pointed out that the “binary module interface” representation that module files compile into may be a good representation for tools to consume, and we may want to standardize it. Others felt that instead of standardizing the representation, we should standardize an API for accessing it.

In the space of build systems and package managers, the group recognized that building “one build system” or “one package manager” to rule them all is unlikely to happen. Rather, a productive direction to focus efforts might be some sort of protocol that any build or package system can hook into, and produce some sort of metadata that different tools can consume. Clang implementers pointed out that compilation databases are a primitive form of this, but obviously there’s a lot of room for improvement.

In the end, the group articulated a mission: that in 10 years’ time, it would like the C++ community to be in a state where a “compiler-informed” (meaning, semantic-level) code analysis tool can run on a significant fraction of open-source C++ code out there. This implies having some sort of metadata format (that tells the tool “here’s how you run on this codebase”) that a significant enough fraction of open-source projects support. One concrete use case for this would be the author of a C++ proposal that’s a breaking change, to run a query on open-source projects to see how much breakage the change would cause; but of course the value of such infrastructure / tooling goes far beyond this use case.

It’s a fair question to ask what the committee’s role is in all this. After all, the committee’s job is to standardize the language and its libraries, and not peripheral things like build tools and metadata formats. Even the binary module interface format mentioned above couldn’t really be part of the standard’s normative wording. However, a format / representation / API could conceivably be published in the form of a Standing Document. Beyond that, the Study Group can serve as a place to coordinate development and specification efforts for various peripheral tools. Finally, the Standard C++ Foundation (a nonprofit consortium that contributes to the funding of some committee meetings) could play a role in funding critical tooling projects.

New Study Group: SG 16 (Unicode)

The committee has decided to form a new study group for Unicode and Text Handling. This group will take ownership of proposals such as std::text and std::text_view (types for representing text that know their encoding and expose functions that operate at the level of code points and grapheme clusters), and other proposals related to text handling. The first meeting of this study group is expected to take place at a subsequent committee meeting this year.

Conclusion

I think this was a productive meeting with good progress made on many fronts. For me, the highlights of the meeting included:

  • Tackling important questions about Modules, such as how to transition large existing codebases, and what to do about macros.
  • C++20 gaining foundational Concepts for its standard library, with the rest of the Ranges TS hopefully following soon.
  • C++20 gaining a standard calendar and timezone library
  • An earnest design discussion about Coroutines, which may see an improved design brought forward at the next meeting.

The next meeting of the Committee will be in Rapperswil, Switzerland, the week of June 4th, 2018. Stay tuned for my report!

Other Trip Reports

Some other trip reports about this meeting include Vittorio Romeo’s, Guy Davidson’s (who’s a coauthor of the 2D graphics proposals, and gives some more details about its presentation), Bryce Lelbach’s, Timur Doumler’s, Ben Craig’s, and Daniel Garcia’s. I encourage you to check them out as well!

8 thoughts on “Trip Report: C++ Standards Meeting in Jacksonville, March 2018”

    1. Please see the sentence just before the introduction: “A few links in this blog post may not resolve until the committee’s post-meeting mailing is published (expected within a few days of April 2, 2018). If you encounter such a link, please check back in a few days.”

      These are examples of such links.

  1. Thanks for the trip report Botond.

    What is your opinion on the proposed changes to the Modules TS, in particular the features related to “legacy headers”? To be honest, I’m not sure how I feel about “import” effectively being a preprocessor macro (should it be #import?), and also about syntax like “#export”.

    1. I think the “modularized legacy headers” support in the “Atom” proposal is critical for transitioning large existing code bases to Modules. For example, for a large codebase like Mozilla’s, I think it would make the difference between being able to transition to modules within a few years, vs. a couple of decades.

      The semantics of a modularized legacy header import are sufficiently different from a textual include that it needs a distinct syntax (for example, it does not “see” macros defined in a textual include above it). I think the proposed import "foo.h"; syntax is reasonable, being as it is distinct from both true module imports (import foo;) and textual includes (#include "foo.h"). Yes, #import "foo.h" has been floated; I think that would be fine too.

      1. I agree completely.

        And the fact this is an “afterthought” with regards to timing, is a great example of why C++ isn’t generally used for new projects. No one can wait 3+ years for the committee to “get it right” when the real world is forcing the pace of change orders of magnitude faster.

        Which is sad.
