[PROPOSAL] Khronos Avatar Extensions - Phase 1 #2512
Conversation
- Updated skeleton_biped
- Updated mapping
- README updates
- Fixing contributors list as well as some logical errors in the extension schemas
In the Summary section where VRM Consortium is used, add " (VRMC)" so usage later in the document is clear.
Where are the schemas and example files? They are missing from this PR.
I was under the impression that the Khronos avatar extensions were going to be a superset of the features provided by the VRMC extensions, but most of the good stuff from VRMC is missing here, such as spring bones and constraints.
This extension defines a 1-to-1 mapping between avatars and scenes. Most glTF implementations will only load a single scene per file. This definition explicitly prohibits having 2 avatars in a scene, and glTF does not provide a mechanism for one scene to be used in another scene. Now, in practice, people using glTF for interchange of 3D models will typically only have one avatar per file, and load multiple glTF files for multiple avatars - but I was under the impression that Khronos wants to allow single glTF files to be used as a last-mile delivery format, where the whole scene, including potentially multiple avatars, is all represented in one file. So, the question that needs answering: Is it a stated goal of KHR_avatar to only allow one avatar per glTF scene/file, and is this goal isolated from the goals of KHR_interactivity and such?
Also, see #1542 - support for multiple scenes in a glTF file is extremely rare, and folks like @donmccurdy have mentioned that this could be removed if there was another compatibility breakage. Even if that won't ever happen, I would recommend avoiding building atop this "feature". Khronos's own implementations do not handle multiple scenes per file correctly - such as the Blender importer, which imports multiple scenes as collections, even though Blender itself has a multiple scenes feature.
> Expression types include:
>
> - **Emotions**: `happy`, `angry`, `surprised`, etc.
> - **Visemes**: `aa`, `ee`, `th`, `oo`, etc.
It's not sufficient to just define "etc". The extension should define a large list of interoperable names, or else the extension does not do much to further interoperability.
Expressions are allowed to define anything in the categories of morph shapes, joints, or textures, so we put the VRM defaults here. You are correct. https://github.com/vrm-c/vrm-specification/blob/master/specification/VRMC_vrm-1.0/expressions.md#lip-sync-procedural
I personally don't want to define the possible visemes. For example, in Godot Engine, we decided to use Unified Expressions. https://docs.vrcft.io/docs/tutorial-avatars/tutorial-avatars-extras/unified-blendshapes
This is the same problem we're currently facing across the industry. We don't have standards or prevalent shared vocabularies. Once we have wider adoption, I truly believe that those that use these extensions can come together to establish those vocabularies.
For now though, establishing it without getting feedback from groups that'd use it would end in frustration. I'd much rather try to establish something that is flexible, interoperable, AND can be used when the community comes together to form those vocabularies.
For now though, the expression and joint mapping extensions are meant to provide mechanisms to map creator expressions to endpoint expressions. This extension is more to denote what an expression 'is' (animation/channel-wise). Once you have the creator/producing pipeline's concept of what the relative expressions are, mapping them to an endpoint's desired/expected set becomes easier.
If anyone is interested in further discussion on viseme blend shape naming standardization, discuss here: meshula/LabRCSF#5
> ### 3. Region Metadata for Accessibility
>
> ```json
> "tags": ["left_hand"]
> ```
Inconsistent casing.
What, exactly, is the list of allowed strings? If the answer is "anything!" then it's not very useful for interoperability, because implementations won't know which strings to listen for. What if one app uses leftHand and another uses left_hand and another uses handLeft? Then none of those will be able to read each other's data when reading a glTF file exported by the other.
I would prefer to use the rules of JSON-LD for defining schema-driven metadata, but we could also look at https://www.w3.org/WAI/standards-guidelines/aria/
See also JSON Schema.
My answer here would be the same as for your question/comment on the expression joint readme.
> | Property | Type   | Description                                                                 |
> |----------|--------|-----------------------------------------------------------------------------|
> | `joints` | object | Mapping from canonical biped joint names to node indices in the glTF file. |
The "canonical biped joint names" need to be explicitly defined in full, for every valid name.
Fallback Skeleton Mappings
Should we fall back to any known skeleton mappings when the skeleton_mapping extension is used? Is VRM's Humanoid something the community feels okay with ratifying, or is the Unity origin something that makes that less desirable? Should we fall back to the OpenXR Skeleton definition?
This is the open question of fallback definition contention mentioned earlier in the readme.
I think canonical here was likely bad verbiage on my end. Going to remove it for now, because I don't believe these extensions should define sets of fixed-name joints quite yet.
> These standard rigs are typically defined by the consuming platform, runtime, or service provider. Each standard rig:
> - Defines a fixed joint name vocabulary and hierarchy.
The KHR avatar extensions should be what defines the fixed name vocabulary, to enable interoperability.
What do you do if a glTF file has "hand_left" and the KHR_avatar_skeleton_mapping says this maps to "handLeft" but then another platform expects "LeftHand" and another expects "HandL"?
We have a vigorous debate about the fallback skeleton versus vendor-specific skeletons. See the readme.
I don't think I agree with this. I think that the vocabularies are by nature going to be fragmented by runtime, and there's little-to-no chance that we'd be able to (in the first phase of these extensions) convince international communities and members of industry to align on a single vocabulary.
I think longer term, yes, we will absolutely be able to define a vocabulary with help from the greater community. I think that comes after initial adoption and engagement with the greater community overall.
The existing fragmentation of vocabularies is precisely why I insist that Khronos should define it, in cooperation with international communities and the industry of course.
At a minimum, I suggest performing a case study to take a look at the names used in various engines and applications. In the meantime, I've been using the same bone names as RPM for my own character models, but it's arbitrary, and it would be nice to have an industry standard set of names once and for all. Also, ideally, allow excluding an explicit skeleton name map when the names match, so we don't need "Hips": "Hips" or similar if the model author chooses to name the bones in a way that already matches the standard.
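For illustration, a hedged sketch of that suggestion, using the key/value mapping shape shown later in this thread (the omission rule itself is hypothetical, not part of the current draft):

```json
{
  "joints": {
    "myRig_hips": "hips",
    "myRig_head": "head"
  }
}
```

Under the suggested rule, a bone the author already named `hips` would need no entry at all.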
> *in cooperation with international communities and the industry of course.*
I think this will happen, just not with the timing of the phase 1 extensions. We have to get them to come to the table for those conversations to happen, and I believe establishing these extensions creates that opportunity.
Bluntly, even with a case study in hand right now, we'd still need extensive collaborations across the board to get to something everyone is happy with. It's going to take time, and I believe we (Khronos and the greater community) are going to make the effort to make it happen. It just won't be in time to align with these initial extensions.
As for your recommendation, that's perfectly reasonable. I believe we can make that assertion/recommendation for runtimes to adopt that behavior in the current state of the extension, without a standard in place.
> ## Overview
>
> The `KHR_avatar_virtual_joints` extension introduces *virtual joints*—custom transform nodes that exist relative to the avatar’s skeletal hierarchy but are **not part of the skinned joint structure**. These virtual transforms serve as semantic attachment or control points for systems like look-at targeting, item equipping, IK hints, and seating positions.
For seating, I would love to see the KHR avatar extensions include/adopt/reference OMI_seat, which defines control points for chairs or other seats, allowing for avatars to sit on those objects. We could perhaps bring this in as KHR_avatar_seat to keep with the KHR_avatar_* naming convention (though it's not part of an avatar itself, just something in the world that an avatar interacts with - though I should also mention, it can have interaction with other categories of extensions, like being used by OMI_vehicle as a pilot seat). I am the author of OMI_seat, I would happily grant Khronos all rights to copy the extension text.
I disagree; isn't it possible for OMI_seat to define a series of KHR_avatar_virtual_joints that are used internally to solve for seating?
Yes, it would be possible to define it that way, saying that for example a virtual joint named "SittingSeatKnee" or some similar name on an avatar should be used to aim towards the seat's knee control point.
I'll take a look and bring it to an upcoming working group meeting to discuss! Thanks!
This is a point of contention, as I want to be able to export an entire scene with, let's say, 15 avatars.
For openness and collaboration, I and others encouraged the early publishing of this draft pull request despite its incompleteness. I hope you understand that reaching for the better design is best done early and together, rather than holding back over superficial errors. Edited: There are parts of the story that are personal information. It is also not cool to ask for more interaction and then complain about the work's lower quality, which is the tradeoff for rapid response. Note that some of the contributors to the Khronos Group aren't paid and/or are volunteering time and effort.
Yeah, frankly I just quickly installed VSCode on my personal computer to crunch this out while recovering from a recent procedure. You're not wrong; I should have been more diligent with formatting. Let me get to that in the next day, apologies.
I agree with @fire's comment above about it being useful for some use cases to have multiple characters in a single glTF scene. There could be 1, 2, 15, 317, or any other amount. These characters don't necessarily need to all be human-controlled "avatars", they could be NPCs in a scene. Such NPCs have similar requirements, like retargeting, as mentioned. Or, they could be characters that human players switch between. In either case, "character" seems fitting to me.
- Added Nick Burkard to the list of contributors
- Changed verbiage where needed to ensure there's no implication of a vocabulary definition as part of the phase 1 set of extensions
- Resolved as many formatting concerns as possible
I think a discussion around scene usage is probably due, especially given your concerns around it initially. I was a bit hesitant around its usage; I suspect this will be something that we need to get takes from the wider community on. I agree I can see the use-case here, but if it's going to cause problems for adoption or tooling it might end up being a non-starter.
Frankly, I've been out of practice with git for far too long, and have forgotten that git commits for a PR are more...eternal than my current active workflow.
I believe we are currently defining feature sets that can be specified with a basis. Features like spring bones or toon materials need more discussion.
extensions/2.0/Khronos/KHR_avatar_expression_procedural/README.md (outdated; conversation resolved)
> ## Overview
>
> The `KHR_avatar_mesh_annotation` extension enables arbitrary per-mesh metadata annotations for avatar models. This provides a generalized way for creators and tools to semantically tag portions of geometry for gameplay, rendering, accessibility, customization, or runtime logic.
I have yet to grasp the intention of the extension, in contrast with KHR_avatar_mesh_annotation_rendering, which has a clear behavioral purpose. We probably need more real-world examples.
I can add more examples here in the near future. I've seen specific runtime needs across numerous implementations where things like mesh annotations would have assisted. It's not a top-priority extension by any means, but it does assist where runtime logic would benefit from mesh annotations/tagging.
Top-of-the-head example around accessories: there may be experiences or sub-instances within an experience that change how the user/character/avatar interacts with or perceives the experience. More specifically, say you have a character or avatar in a desert experience where the developers, for some reason, adjust shader parameters based on whether the character has sunglasses. With a metadata label denoting what a submesh or given mesh is (in this case, sunglasses), it removes a few steps for avatar creators and developers to enable scenarios like this.
That being said, this of course leads to a "Not every experience shares a vocabulary" scenario, but at least this helps start that conversation in terms of asset labeling.
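To make the sunglasses scenario concrete, here is a hedged sketch of how such a tag might appear on a mesh primitive, using the draft's `tags`/`customData` properties (the tag values and the `customData` payload are purely illustrative):

```json
{
  "meshes": [
    {
      "name": "sunglasses",
      "primitives": [
        {
          "attributes": { "POSITION": 0 },
          "indices": 1,
          "extensions": {
            "KHR_avatar_mesh_annotation": {
              "tags": ["sunglasses", "accessory"],
              "customData": { "tint": "desertGlare" }
            }
          }
        }
      ]
    }
  ]
}
```

A runtime could then look up the `sunglasses` tag and adjust its shader parameters without any knowledge of the asset's mesh naming conventions.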
| "source": "myRig_leftFoot", | ||
| "targets": [ | ||
| { "joint": "leftFoot", "weight": 0.8 }, | ||
| { "joint": "leftToeBase", "weight": 0.2 } | ||
| ] |
I'm afraid such one-to-many mapping might not be supported by the humanoid skeleton system in Unity. I feel like it needs a component like constraints or custom scripts, and it probably overcomplicates the implementation. I want to check the demand volume before having this.
I agree, we should only have one-to-one mapping.
As an example use case, digitigrade legs on a furry character have 3 segments, but humanoid rigs have legs with 2 segments. There is a simple solution commonly used for this case: having a separate set of bones with no mesh attached, use those for the humanoid skeleton, and then use constraints to copy the transforms of the humanoid bones to the real bones, with the upper and lower leg segments both copying the humanoid thigh, where the humanoid thigh's bone length is the sum of the upper and lower leg segments. You might look at this and think that it would be nice to be able to semantically map both leg segments as the humanoid thigh, but in practice that doesn't add value. IK systems expect to work with 2-segmented humanoid legs, and full body tracking systems use real human legs as their input data, which are 2-segmented. I don't know if more complexity in the specification could improve upon what already works as a general solution. Adding the fake bones as glTF nodes and adding the constraints would only be a few hundred bytes of JSON in the glTF, and keeps implementations simple.
@0b5vr Agree that we should identify the demand volume. Let me add it to the open questions in the next day or so.
I don't think we should necessarily limit ourselves based on whether engines support this natively or not at the time of publishing. Agreed that it then leads to custom scripts/constraints, but if we provide utility that then leads to adoption, we hopefully can generate enough interest to get native support for it added.
Right now, there are several platforms where getting a custom/bespoke avatar to adhere to platform expectations typically requires switching to the rig hierarchy of that platform. This is also the case in many smaller experiences. If a given character/avatar was designed with a reduced rig in mind, this then means that the creator has to redo a large amount of work for platform compatibility.
In the case of one-to-many, spine and neck joint mappings immediately come to mind. If a creator has 3 spine bones initially and a platform expects 5, this extension can provide the mapping to the expected 5 and enable distribution of the animation values across the creator's 3 bones (creating a smoother animation for the reduced rig set).
Perhaps part of this extension should also indicate what portion of the joint movement is desired (translation, rotation, scale). That level of granularity would likely assist in a variety of scenarios.
@aaronfranke; while that's definitely a solution, it sounds like it's orthogonal to what's being proposed here; both could exist. If anything, it sounds like the extension could actually assist with your scenario, as you could then leverage it with the scripts you're using for the retargeting for the initial mapping step (Unless I'm missing something).
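To make the spine case concrete, a sketch using the `source`/`targets` shape from the excerpt above (joint names illustrative): one creator spine bone receiving a weighted share of two of the platform's expected bones.

```json
{
  "source": "myRig_spine1",
  "targets": [
    { "joint": "spine", "weight": 0.7 },
    { "joint": "chest", "weight": 0.3 }
  ]
}
```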
I agree with the opinion that, from the perspective of glTF portability, this extension should support only 1:1 mapping.
If my understanding is correct, glTF places importance on portability, meaning that it is desirable to obtain the same results in any environment. In other words, it is preferable that any platform can handle the specification smoothly and easily.
When considering the glTF specification, I saw many discussions from the perspective of whether major platforms and engines would be able to process it without problems, and in some cases, even for features that seemed to have high demand, adoption was postponed if there were engines for which processing seemed difficult.
If it is an extension, those portability requirements may be somewhat relaxed, but if we aim for a standard KHR extension, it would be desirable to aim for the same level of portability.
> I don't think we should necessarily limit ourselves based on whether engines support this natively or not at the time of publishing. Agreed that it then leads to custom scripts/constraints; but if we provide utility that then leads to adoption, we hopefully can generate enough interest to get native support for it added.
For the above reasons, I have a different view. If there are definitions that are not easy to process, support for the extension itself may not progress and its spread may be hindered, and I think that self-imposed limitations are in fact an important element for glTF. The fact that processing expected on a particular platform may no longer be reproducible in glTF, or that creators may have to redo their work, is, to some extent, an unavoidable sacrifice for standardization. Striking the right balance is the key element.
If we want to generate interest, I think there is the option of defining it as an EXT or vendor extension with a lower degree of standardization. The KHR extension would support only 1:1 mapping, the EXT/vendor extension would extend the KHR extension to also support one-to-many mapping, and on platforms that do not support the EXT/vendor extension, the KHR extension would be the fallback. This scenario would keep the problem small while providing engines, etc., an opportunity to try one-to-many mapping.
I think the discussion would go more smoothly if the rules, or at least clearer guidelines or policies, on what this extension suite does and does not cover were made more concrete. This includes not only the 1:1 mapping but also the discussion around auto.
Personally, since these are standard KHR extension suite that many engines are likely to implement and use for various purposes, I think anything that is hard to implement or becomes complex on certain platforms (especially major ones), or is likely to cause performance issues in real-time runtimes, should be considered out of scope. If needed, we could consider a separate, lower-standard extension.
I think it's going smoothly, all things considered. Honestly, the discussion around this proposal is what I'd expect the back-and-forth on extensions such as this to be.
I believe the extensions are pretty self-contained in terms of what they do and don't do, and the discussion up until now has improved the overall documentation and proposal as you've all asked for more clarity (which I appreciate). What other items are not clear? Do you have any other examples that I can try to address?
In the case of 'auto', it currently is not present in the extensions (and hasn't been in any iteration).
In the case of 1-to-1 mapping versus 1-to-many, I've removed the latter for now as it's more appropriate to introduce it in a later phase (as part of the longer-term character extensions), with additional accompanying extensions for remapping/retargeting and informing motion systems. You're correct that there are potential performance issues in rare edge cases (like mapping every joint to every other joint). Do you see other performance concerns, and can you go into more detail around the difficulties of implementing support for this?
> | Property           | Type   | Description                                                                                              |
> |--------------------|--------|----------------------------------------------------------------------------------------------------------|
> | `renderVisibility` | string | Controls camera-based visibility. Enum: `"always"`, `"firstPersonOnly"`, `"thirdPersonOnly"`, `"never"` |
auto exists in the VRM FirstPerson spec. Its behavior is to remove vertices that have a weight associated with the humanoid head at runtime.
https://github.com/vrm-c/vrm-specification/blob/master/specification/VRMC_vrm-1.0/firstPerson.md
There are several challenges to introducing auto in the KHR_avatar spec:
- We must associate the skeleton extensions with the mesh annotations extension to define which bone is the head.
- We must recommend or define how the runtime should hide the head polygons.
- Having a separate mesh before exporting would be more efficient than hiding polygons at runtime, and this might be the recommended way as a standard.
However, it's true that many VRM avatars already depend on auto.
I believe that VRChat users also don't specify first-person mesh annotations by themselves. There is a component called VRCHeadChop that specifies which mesh should be hidden in addition to the head in the first-person view.
https://creators.vrchat.com/avatars/avatar-dynamics/vrc-headchop/
I'm a little afraid of introducing Auto, as it introduces platform-to-platform differentiation (which perhaps is fine, but needs to at least be noted), even with recommendations. There's also the challenge of runtimes then meeting creator expectations. That being said, I can see how this makes the user workflow easier.
Totally understand that they don't denote it themselves for VRChat; that being said, I think adding a component for this is functionally the same as them annotating the mesh (albeit at a higher level, given it then happens in-engine).
Do we feel as though VRM could continue to have Auto as part of a VRM-specific extension on top of this one, or do we absolutely need to have it in this extension?
I checked the spec of auto in VRM FirstPerson, and I think it would be better not to include it in this extension. The implementation seems a bit too complex.
If my understanding is correct, there are issues such as:
- Heavy per‑vertex processing, which raises performance concerns in runtime environments like viewers.
- Potential complexity from mesh splitting, which could complicate other extensions as well as core glTF Mesh/Node processing.
- Cross‑extension dependencies (even within the KHR_avatar extension suite). Extensions without dependencies and higher independence are simpler and preferable.
Since this avatar/character extension suite seems to be aiming for standard KHR extensions, I personally think it's better to define a low-complexity and easy-to-implement specification. As a result, the extensions are more likely to be supported across many environments and see broader adoption.
If auto is desired, I personally feel it would be more appropriate as a vendor extension, maybe one that extends this one.
| { "target": "spine", "weight": 0.5 }, | ||
| { "target": "chest", "weight": 0.5 } | ||
| ], | ||
| "JawJoint": [{ "target": "jaw", "weight": 1.0 }] |
One of the VRMC members points out that the jaw should no longer be included in the humanoid skeletons since it should rather be controlled by expressions.
I think that depends on the implementation/use-case. I think it's totally fair to have a bone in the skeletal hierarchy (and an understanding as to how it maps) even with an expression that potentially powers it.
Thank you for the important proposal. I've long wanted a standardized and reusable humanoid skeletal definition in glTF, so this is a very interesting proposal for me. Since the proposal covers a lot and the discussion has already grown quite long, I may have missed some parts of the conversation, but I'd like to start by sharing a few of my thoughts. Apologies if any of this is already outdated.

This may tie into the topic of Avatar vs Character, but I'm wondering whether functionality that is not specific to avatars could be actively separated from the Avatar extension and defined as its own general-purpose extensions. Those could then serve as foundational extensions, with the Avatar extension built on top of them. If we first focus discussion and review on those foundational extensions, then move on to the Avatar-specific ones afterward, I believe we could progress through the specification process more smoothly.

While the proposal is divided into phases, I personally feel that with so many extensions being proposed at once for Phase 1, it has become difficult to keep the discussion focused. I understand the intent was to gather broad feedback by sharing the proposal early, but for those who join later, the volume of information makes it quite hard to catch up with the ongoing discussion. Would it be possible to treat the general-purpose foundational extensions as a sort of “Phase 0”?
While I generally agree with the idea that we should consider each component step by step, there seem to be cases where we should think about the final picture that multiple extensions achieve together, to avoid overlooking features essential to our purpose, like the mesh annotation "auto" discussion.
Absolutely no worries; we really have just started this conversation! Feel free to jump in with your concerns, even if they echo the concerns that have already been stated by others. Hearing back from the community will help us make informed decisions/changes to the spec as it evolves!
I agree with this conceptually, but we'd need to have consensus that the extensions themselves are general purpose enough to be separated from the avatar extension set. Right now I'm not entirely sure what could be considered general purpose enough (other than perhaps the KHR_avatar_mesh_annotation extension). What portions of these would you consider as a general-use phase 0 set of extensions?
+1 to this. Given the composition, it might become harder to separate out than initially thought.
| "jointBindPoses": [ | ||
| { | ||
| "joint": 0, | ||
| "matrix": [ |
There's likely a better way to represent this via an additional skin object with a different skeleton and InverseBindMatrices reference for the A/T/Custom Pose.
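For instance (a sketch of that alternative, not part of the current proposal): core glTF already allows multiple `skins` entries over the same joints, so an extension could point at a second skin whose `inverseBindMatrices` accessor encodes the alternate pose:

```json
"skins": [
  { "name": "bindPose", "inverseBindMatrices": 4, "joints": [1, 2, 3] },
  { "name": "tPose", "inverseBindMatrices": 5, "joints": [1, 2, 3] }
]
```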
Realized that the formatting done on some of these areas prettified it in a way that is inconsistent with other extensions in the Khronos repo (new lines around the allOf fields). Fixed this.
> | Property           | Type   | Description                                                                                              |
> |--------------------|--------|----------------------------------------------------------------------------------------------------------|
> | `renderVisibility` | string | Controls camera-based visibility. Enum: `"always"`, `"firstPersonOnly"`, `"thirdPersonOnly"`, `"never"` |
What use cases is `never` intended for? For example, is it invisible by default, but shown by switching cosmetics? If that’s the use, then it isn’t camera-based, right?
I also wonder whether `always` is even necessary. The default behavior without this `renderVisibility` setting is always visible, so `always` doesn’t add any new information.
For example, wouldn’t it be more convenient if this property handled only which camera types should not display it? And if, separate from this property, we allowed a simple visible/invisible setting on the Mesh, the intended use would be clearer. One advantage of doing it this way is that you could express a combination like initially invisible and hidden in FirstPersonCamera as well. With the current spec, that isn’t possible.
Proposal (I think we can find better names):

```ts
visibility?: boolean;
cameraBasedVisibility?: 'invisibleInThirdPerson' | 'invisibleInFirstPerson'; // either one
```
Agreed on us needing to structure this better and have better field names. I also question always/never and appreciate you bringing it up, as I've gone back-and-forth on them.
I think having the split as you suggest makes sense (it makes me think we'd then potentially also want a mesh primitive default visibility extension like the node visibility extension, but we'd then need to figure out how this ladders into it).
Having the default visibility as part of this then makes me think that cameraBasedVisibility would need an option for when something is visible in both (but has an initial invisible default visibility setting).
Edit: To elaborate on that last point: I think we'd just need "Both" if it's not split, as then I could see this getting used purely for mesh primitive default visibility.
Functionally, one-to-many compositions like this describe how to achieve a particular destination expression, and the way it’s written right now could technically achieve that, but it is convoluted. It should be reversed:
`<target expression> : { <source expression, weight>, <source expression, weight> }`
It also doesn’t make sense to require them to add up to 1.0, as blendshapes don’t really behave that way (and if anything, the weights should reflect how much they contribute to the expression and the max value they should be).
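As a sketch of the reversed shape (the property and expression names here are hypothetical), with each weight read as a maximum contribution rather than a share of 1.0:

```json
"expressionMappings": {
  "happy": {
    "smileLeft": 0.8,
    "smileRight": 0.8,
    "cheekPuff": 0.25
  }
}
```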
I've made changes to make it clearer, and fixed up the schema (gave it the `glTF.` prefix and added proper detail, as it was really a placeholder before).
Following up with fixing up the schemas elsewhere as well.
Realized some of the verbiage on this has led to some misunderstandings when reading the proposal. I've added some context above the example schema as well as rephrased the first section to make it less confusing; sorry for the churn!
> | Property     | Type     | Description                                                            |
> |--------------|----------|------------------------------------------------------------------------|
> | `tags`       | string[] | List of free-form labels applicable to this primitive                  |
> | `customData` | object   | Optional free-form object for runtime-specific annotations (optional)  |
Do the “tags” and “customData” properties essentially need to be separate? Would it be acceptable to combine them into a single property, where items that do not require associated data are set to an empty object? I feel that having fewer properties would be cleaner.
Example:

```json
"annotations": {
  "touchable": {},
  "foo": {
    "color": { "r": 0, "g": 0, "b": 0 }
  }
}
```
It might be. That being said, we're delaying this extension and figuring out if we want to use EXT_structural_metadata (and if not, we'll be bundling this with a future extension set proposal around metadata/LODs/renderviews).
Hi all,

Apologies for the lack of movement here. We've been doing scenario analysis around the current set of extensions to determine next steps for them. We've categorized the current set of extensions into Keep/Amend/Delay/Add.

Keep:

Amend:

Delay:

Add: We're not sure when we'll add these, but we identified them as follow-ups for expressions.

The above changes will happen in the near-ish future (our next meeting is on 12/15, so we'll be iterating on the amendments needed after that point).
```json
{
  "extensions": {
    "KHR_character": {
      "sceneIndex": 0
```
Have we already discussed the property naming `sceneIndex`? If not, I would prefer `scene` instead, following the property with the same name in the core spec.
We haven't! I actually think we might want to revisit using the scene index altogether due to it being not well supported across the board (and instead use a root node index concept). If we decide against that, we can change it to just being `scene`.
| "expression": "smile", | ||
| "animation": 0, | ||
| "extensions": { | ||
| "KHR_character_expressions_morphtarget": { |
| "KHR_character_expressions_morphtarget": { | |
| "KHR_character_expression_morphtarget": { |
Thanks for catching this! Will update/fix
| "expression": "frown", | ||
| "animation": 1, | ||
| "extensions": { | ||
| "KHR_character_expressions_morphtarget": { |
| "KHR_character_expressions_morphtarget": { | |
| "KHR_character_expression_morphtarget": { |
> This extension **does not animate morph targets directly**. It provides metadata only.
>
> All morph target expressions should be driven using standard glTF animation channels, targeting the `weights` path on the corresponding node:
Would it also support `/nodes/i/weights` and `/nodes/i/weights/j` if the implementation supports KHR_animation_pointer?
I thought it would be easier to describe morph animations using `/nodes/i/weights/j` if the node has many morphs and the animation wants to control only a few of them.
You bring up a really good point; I think we likely need to change this extension to explicitly depend on KHR_animation_pointer.
While in theory we can get around this by providing guidance on how to utilize the zeroed-out animation frames for other expression morph targets, it would end up wasting a huge amount of space animation-wise. Thanks for catching this!
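For reference, a channel animating a single morph weight through KHR_animation_pointer would look roughly like this (accessor indices illustrative):

```json
"animations": [
  {
    "channels": [
      {
        "sampler": 0,
        "target": {
          "path": "pointer",
          "extensions": {
            "KHR_animation_pointer": { "pointer": "/nodes/0/weights/2" }
          }
        }
      }
    ],
    "samplers": [
      { "input": 7, "output": 8, "interpolation": "LINEAR" }
    ]
  }
]
```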
| "myRig_hips": "hips", | ||
| "myRig_head": "head", | ||
| "myRig_leftFoot": "leftFoot", | ||
| "myRig_rightFoot": "rightFoot", | ||
| "myRig_leftHand": "leftHand", | ||
| "myRig_rightHand": "rightHand" |
The key-value direction is opposite between KHR_character_skeleton_mapping and KHR_character_expression_mapping.
Also, we might want to use an object on the value side instead of a literal value to add extra properties in the future...? Low confidence.
I leave a schema idea below:
"skeletalRigMappings": {
"vrmHumanoid": {
"hips": {
"node": 0 // if we are going to delay KHR_character_skeleton_biped, this is going to be a node index instead of a node name I believe
},
"spine": {
"node": 1, // we might make it optional for the sake of futureproof...? as how /animations/{}/channels/{}/target/node does
"extensions": { // an example that adds extra information to the mapping, not confident enough to give a more practical example
"KHR_character_skeleton_mapping_something": {
"something": [
{ "node": 1, "weight": 0.5 },
{ "node": 2, "weight": 0.5 }
]
}
}
}
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I updated the expressions mapping and neglected to update this, but that definitely led to a weird pattern. I think reversing it to match the expressions mapping extension is likely the immediate next step.
I don't think delaying KHR_character_skeleton_biped means this has to use the node index instead; though perhaps it'd be better to have the index as well due to there being a potential for duplicate node names. Definitely something to talk about.
About Given the VRM1.0 spec compatibility, I noticed that
However, the spec also states that it can animate properties defined in
Would we assume that it is valid to mutate extension properties not mentioned in the Asset Object Model documentation if an implementation supports it?
This is a much harder one; we should discuss it more during the next TSG meeting. Personally, I think this lies in the realm of "Yes, it's valid, but your mileage may vary" given the example would be using the `extras` objects (which are already caveated as being on a per-application basis). I'd assert that we'd likely want to recommend gracefully degrading if such a thing appears and can't be interpreted by the application.
Addressing the issues 0b5vr pointed out
Delaying while we identify whether we want to utilize and recommend EXT_structural_metadata for this purpose. If not, we'd likely iterate on extensions with an upcoming LOD/Renderview extension set proposal.
KhronosGroup#2512 (comment) Delaying and will add to the above-mentioned LOD/Renderview extension set proposal we'll be iterating on in the future.
New person joining in here: This effort has come to my attention and I wanted to add some thoughts on behalf of non-developers, designers, artists, and learning experience professionals.

Lack of specific examples and lack of standardization, as highlighted by @aaronfranke and others on this proposal, hold back forward momentum in the entire industry. As a beginner avatar maker, it is frustrating to make a product that, only after development, I discover doesn't work as designed on a platform. Tutorials, instructions, and deep technical pools don't help when what is needed is a standardized, cross-platform, clear set of expectations. I realize that this entire proposal essentially agrees with this; I'm preaching to the choir. But a milquetoast approach of holding off on specifics when Khronos is properly authorized to set a standard is, in my opinion, a laissez-faire method of trying to not offend or cut off anyone. But we know where this road leads. As the phrase goes, "One size fits all means that it essentially fits none".

Also, a word for our users, those that "wear" our avatars. Two decades of experiences in XR rarely (yes, sometimes, but rarely!) return impactful experiences within a scene or with a view. But users do remember their avatars. They take selfies. They role play. They escape their physical boundaries. A user chooses an avatar, and then the avatar shapes the user. I don't think it can be overstated how important avatars are to their users. The future of all XR experiences is hindered if a user can't easily choose to be a book, a dragon, or a Pharaoh. Clear standardization will grease the tracks for more avatar creators, which opens more XR doors for everyone.
@hobbs-Hobbler Totally understand your frustration. It's why we're iterating on this and future sets of extensions, given the lack of standards (even in terms of basics on how to convey them). We're currently working with other contributors in the Metaverse Standards Forum on the topic, focusing first on the ongoing translation framework work going on there (for now skeletal, followed by expressions; link here to Meshula's RCSF). The hope is that this will ladder up into some form of standardization that we agree on across SDOs/organizations/companies/communities, which we can incorporate into our extension sets. Re: examples; this diff is currently in the proposal stage and being iterated on. Once we make the changes being discussed in recent conversation, I expect that we'll then be producing example assets for reference.

I'd recommend you watch the recent Metaverse Standards Forum Characters Town Hall that we assisted in organizing. I talk about the philosophy behind our decisions there (and you'll hear the perspective of other groups around character/avatar standards as well).
About: I believe that the glTF spec does not describe how we are going to play two or more animations at once. If node rotation animations are mutually exclusive, we have to prepare two joint nodes for each rotation axis of the eye movement. Like, when we have a model with a single bone for each eye, if we set 1 to both. I personally am okay with defining it as is, but does this match your intention?
I noticed that bone rotation animation might be mutually exclusive if we want to apply two or more at once. Left a comment on the spec PR: KhronosGroup/glTF#2512
…KHR_animation_pointer Based on 0b5vr's feedback, it became more obvious that we should have this extension rely on KHR_animation_pointer so we could animate explicit weights indices rather than the whole property. I've updated the README to reflect this.
Adding IPose pose type, as well as modifying the bindpose extension to enable multiple poseType/bindpose definitions
After discussing it with the TSG; switching to the root node index makes more sense here given scenes aren't really well-supported across the board. Also added new contributors to the extension
…th KHR_character_expression_mapping It was pointed out that KHR_character_skeleton_mapping should align more with KHR_character_expression_mapping in terms of the keys being the target names, and the values being the source joint names; so I've updated it to do so.
I should also leave the link here; I published my testbed for checking compatibility between VRM and KHR_character. It includes a conversion script from VRM1.0 to the current KHR_character draft spec. The main motivation is to support the development and discussion of the KHR_character extension by applying it to real-world models. https://github.com/0b5vr/khr-character-testbed
From the Governance Team of the Hubs Foundation: The Governance Team of the Hubs Foundation has discussed this. We know that most avatars are designed using different conventions on how to name bones than the convention we currently use. We concluded: if there are standard names for bones and how they connect, it would be a priority for us to implement. If there aren't, we would only implement sets of bone names and connections once they had a large pool of avatars our users could draw on. Our current needs are modest, and almost any reference rig would work for us. Of the reference rigs we've looked at, the Reference Canonical Skeleton Framework Architecture appears to us to have the best balance of clarity and extensibility.
Pull Request Draft: KHR Character and Avatar Extension Set – Phase 1
Summary
This PR introduces the initial suite of KHR_character and KHR_character_avatar extensions, a collaborative effort between the Khronos 3D Formats Working Group and the VRM Consortium (VRMC). These extensions aim to provide a structured and interoperable foundation for representing character and avatar models in the glTF ecosystem across platforms, runtimes, and tooling pipelines.
Motivation
Characters and Avatars have emerged as a core content primitive in real-time applications including gaming, virtual reality, social communication, streaming, and telepresence. Yet, there has been a lack of standards around them to express key character/avatar-specific behaviors such as:
To address this, the Khronos 3D Formats Working Group and VRMC have collaboratively designed a modular set of extensions for avatar assets. With these, we aim to provide creators and developers with a standard representing an expectation in data and functionality per-avatar that can be used across platforms, like building blocks.
Characters
A 3D model representing a potentially interactable, controllable, and/or generally animatable entity.
Examples include user avatars, characters in animated entertainment, NPCs in games controlled by behavior systems, virtual agents embodying a character, etc.
Avatars
A type of Character which is embodied and controlled by a user, representing that user’s identity.
Examples include user-driven characters in third-and-first person experiences, VR scenarios with full-body embodied avatars, 2D telecommunication scenarios with embodied characters representing other users, etc.
Phase 1 Scope
The extensions in this PR represent the Phase 1 Extensions (and more), as outlined in the Khronos Avatar Extensions Working Document.
Core Extensions
KHR_character – Root-level flag denoting a character model glTF asset.

Update 12/12 - The below was delayed to a future extension set proposal:

KHR_character_avatar – Extension built on top of KHR_character denoting a model is intended for use as an avatar.

Expression Extensions
Expressions in this context describe face-localized animations used to drive small and/or larger movements across the face and/or down-chain meshes needed for reasonable conveyance of emotion/intent.
For examples of relevant types of expressions, you can reference concepts such as:
Expression Extensions - Core
All rely on standard glTF animations, and target different control domains:

KHR_character_expression_morphtargets
KHR_character_expression_joint
KHR_character_expression_texture

Character expressions conform to 0-to-1 float values on top-level as ‘drivers’ in a similar way to how morph targets are typically used.
Any key-framed animation (whether joint, morph target, or animation_pointer) still relies on the 0.0-to-1.0 float property; which then interpolates between the N keyframes (e.g. For N keyframes, treating the 0-index keyframe as 0.0, and the Nth keyframe as the 1.0 state. For 1 keyframe, it interpolates between the target model property at rest and that keyframe).
Regardless of the above, we still want to respect creator-defined default values for the morph targets. This may result in some clipping/mesh weirdness when being driven in an experience, but it’s the trade that an avatar creator makes.
For properties represented in the animations not covered by the animation expression extension type: for each expression-mapped animation, the runtime checks what extensions are present to inform which channels are expected to be animated or not.

As an example, if an animation contains a weight channel, but there’s no *expression_morphtargets extension, the expectation is that it won’t animate that channel.
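Assembled from the diff excerpts reviewed above, a single expression entry looks roughly like this (the surrounding container property is not shown in this PR description, so treat it as an assumption):

```json
{
  "expression": "smile",
  "animation": 0,
  "extensions": {
    "KHR_character_expression_morphtarget": {}
  }
}
```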
Expression Extensions - Metadata-adjacent
KHR_character_expression_procedural – Provides context whether an expression expects to be driven procedurally or not (and then, if so, what the creator's preferred method of handling it is)
KHR_character_expression_mapping – Common expression vocabulary normalization

Skeleton/Rig Extensions
KHR_character_bindpose – Declares original skeletal bindpose metadata
KHR_character_skeleton_mapping – Maps joints between arbitrary rigs (1:1)

Update 12/12 - The below was delayed to a future extension set proposal:

KHR_character_skeleton_biped – Declarative semantic labeling of a bipedal skeleton

General Extensions
The below are added as part of this as they provide value to characters and avatars, but are not tied enough to be in the direct namespace. We are still presenting them as part of phase 1 due to our belief they are a net-positive add.
Mesh Annotation Extensions
Update 12/12 - The below were delayed to a future extension set proposal
KHR_mesh_annotation – General-purpose semantic tags per mesh primitive
KHR_mesh_annotation_renderview – Describes render-time visibility for first and third person view modes

Virtual Transform Extension
KHR_virtual_transform – Runtime targets for attach points, look-at control, etc.

Design Principles/Philosophies
Modular and layered: Aiming for extensions in similar categories to be built on each other where it makes sense, and independent of one another where it doesn’t. Layering on top of the baseplate extensions (KHR_character for character specific and KHR_character_avatar for avatar-specific functionality) where possible when the functionality makes sense only in the context of a character and/or avatar.
Aiming to be compatible with VRM and other avatar ecosystems, with an overall goal in this phase to not unnaturally force existing avatar systems to conform to vocabularies and hierarchies.
Enabling a self-describing character/avatar - extensions indicate what data types are contained and need to be driven to animate/embody/power the character upon being loaded. Additionally, mapping extensions and other proposed metadata extensions are used to assist in enabling the character’s general compatibility with a loaded-into runtime.
Recurring patterns - For example, with the expressions extensions the goal was to create a recurring expectation as to how to access the animation channels/fields. With these extensions (and future ones), all expression channels utilize glTF’s animation model (weights, rotation, translation, scale, and KHR_animation_pointer for other properties).

Enabling VRM to adopt these extensions out of the gate. We have a longer-term convergence plan, and VRM utilizing the proposed extensions here as needed is part of it.
Future Work
Phase 2 will contain more sets of functionality that the community will help inform! Right now, here are some topic areas we're thinking of:
In parallel to Phase 1 and Phase 2, we’re currently engaging with the Metaverse Standards Forum and AOUSD in order to start conversations around potential standards for rigs/expressions (around vocabularies). These are, by nature, longer conversations.
Because of that, we’re going to continue to make progress on this set of extensions (in phase 1 and 2), and keep in mind the design that it should be able to support any future standards around vocabularies and rigs.
Open Questions to the Community
We would love feedback on several topic areas!
Expression-Based Mesh Activation
Should activation/deactivation of nodes or submeshes based on expressions (e.g., switching between open/closed eyes or toggling glasses) be included in the expression extension family, or separated into a visibility/scene control extension?
Ideally, we'd then be utilizing the in-progress EXT_node_visibility proposal.

Fallback Skeleton Mappings
**TL;DR - We're going to be working with AOUSD and the Metaverse Standards Forum to foster discussion on industry rig/expression vocabularies. Because of this, we don't believe we should enforce this in this phase. Discussion Link 1, Discussion Link 2**
Should we fall back to any known skeleton mappings when the skeleton_mapping extension is used on the DCC/Content Tooling side? Is VRM's Humanoid something the community feels okay with ratifying, or is the Unity origin something that makes that less desirable? Should we fall back to the OpenXR Skeleton definition?
VRM 1.0 Humanoid
OpenXR Skeleton
LookAt Implementation: Separate or Virtual Transform?
Should a standardized LookAt construct (like VRM’s LookAt extension) be:

Both options are currently viable.
Avatar vs Character Namespaces
With feedback from the community, we’ve made the decision to transition to having KHR_character be the base namespace, with KHR_character_avatar denoting a model is an avatar!
Are we missing anything obvious for our V1 set of Extensions proposed here?
We'd love to hear back as to whether we should consider adding more extensions overall to this proposal; or if any changes are needed!
References
License
All extensions in this PR are licensed under the Khronos Group glTF Extension License.