MIDI 2 Specification Part 1: UMP and MIDI-CI
Introduction
MIDI 2.0 is a major update to the MIDI protocol, moving from the simple 8-bit serial byte stream of MIDI 1.0 to a structured, high-resolution packet-based format. On a protocol level, one big change is that MIDI 2.0 is bidirectional – devices can both send and receive, enabling two-way “conversations” rather than the one-way messages of MIDI 1.0. This allows MIDI 2.0 gear to automatically negotiate features and fall back to MIDI 1.0 if needed for compatibility. Another fundamental change is the adoption of the Universal MIDI Packet (UMP) format to carry messages. UMP packets are 32 bits wide (or larger multiples of 32 bits), instead of the 8-bit bytes used in MIDI 1.0. This new format lets MIDI 2.0 transmit much more data per message – for example, 32-bit values instead of 7-bit – providing dramatically higher resolution for controllers and parameters. It also expands the addressing scheme. Instead of the 16 channels of MIDI 1.0, MIDI 2.0 supports 16 groups of 16 channels (up to 256 channels total) by using a 4-bit Group field in the packet. Despite these expansions, MIDI 2.0 maintains backward compatibility by design. MIDI 2.0 devices can detect if a connected device only supports MIDI 1.0 and then communicate in MIDI 1.0 mode transparently. In short, MIDI 2.0 introduces a far more expressive and flexible protocol (higher resolution, per-note control, better timing) while still respecting the fundamentals of MIDI 1.0 for compatibility.
Why a new packet format? The introduction of UMP is to overcome the 1980s-era limitations of MIDI 1.0’s byte protocol. MIDI 1.0 messages were limited to 7-bit data values (0–127) and had fixed small lengths (typically 1–3 bytes per message). MIDI 2.0’s UMP format replaces the old status/data byte stream with a 32-bit aligned packet structure that can encapsulate both MIDI 1.0 and new MIDI 2.0 messages. This packetization is transport-agnostic (it can run over USB, network, etc.) and reserves space for future expansion of the protocol. By using UMP, MIDI 2.0 can bundle more information into each message (for example, 32-bit controller values, per-note data, new message types) without breaking older devices. In summary, on a protocol level MIDI 2.0 differs from 1.0 by using a modern packet format with higher data widths, adding bi-directional negotiation, and greatly increasing the precision and range of musical data – all while ensuring older MIDI devices can still communicate by default. The following sections provide a deep dive into these new aspects of MIDI 2.0, focusing on bits, bytes, and protocol structures.
Universal MIDI Packet (UMP) Format
MIDI 2.0 defines the Universal MIDI Packet (UMP) as a container for all MIDI messages (both 1.0 and 2.0 protocols). Unlike the MIDI 1.0 “stream of bytes” format, UMPs are fixed-size 32-bit words (or groups of words) that encapsulate an entire MIDI message. A UMP can be 1, 2, 3, or 4 words long (i.e. 32, 64, 96, or 128 bits) depending on the message type. The first 32-bit word of every UMP has a standardized header structure: it includes a 4-bit Message Type field and a 4-bit Group field at the top, followed by fields specific to that message type. In binary, the first 32-bit word can be visualized as follows:
Bits 31–28: MT | Bits 27–24: Group | Bits 23–16: Status | Bits 15–0: (depends on type)
- Message Type (MT) – 4 bits (bits 31–28): This identifies the kind of message and implicitly its size/format. For example, a Message Type of 0x1 indicates a System message (32 bits long), 0x2 indicates a MIDI 1.0 Channel Voice message (32-bit), 0x4 indicates a MIDI 2.0 Channel Voice message (64-bit), 0x5 indicates a Data message like SysEx (128-bit), etc. The MT not only labels the message category, but also tells the receiver how many 32-bit words to read for the message. (Unused values are reserved for future expansion.)
- Group – 4 bits (bits 27–24): This is an optional group address used to expand addressing beyond 16 channels. Each group is like a separate port or bus; a Group value of 0–15 addresses one of 16 groups. Within each group, the familiar 16 MIDI channels exist. In essence, Group extends the MIDI channel space – if not needed, it can be zero. Groups allow up to 16×16=256 channels in one MIDI network, or can be used to route messages to different device virtual ports over a single link.
- Status (Opcode) and Channel – 8 bits (bits 23–16): This byte is analogous to the MIDI 1.0 status byte, but split into a 4-bit opcode (the message code) and a 4-bit channel number. The opcode indicates the specific message within the category (e.g. Note On, Control Change, etc.), and the channel is the MIDI channel 0–15 for that message. In MIDI 2.0 Channel Voice messages (MT = 0x4), for example, the opcode 0x9 indicates “Note On” and the 4-bit channel indicates which channel (just as 0x90–0x9F were Note On status bytes in MIDI 1.0). For System messages (MT = 0x1), this Status field contains the system message code (like MTC, Song Select, etc.) since system messages are not channel-specific. For Utility/stream messages, this field is defined accordingly (e.g. a timestamp marker code for JR Timestamp, described later).
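The field layout above can be made concrete with a few shifts and masks. The following is a minimal sketch, not taken from any particular library: the names `UmpHeader` and `decode_ump_header` are hypothetical, and the opcode/channel split only makes sense for channel voice message types (for System messages the whole status byte is the message code).

```python
from typing import NamedTuple


class UmpHeader(NamedTuple):
    message_type: int  # bits 31-28
    group: int         # bits 27-24
    opcode: int        # bits 23-20, high nibble of the status byte
    channel: int       # bits 19-16, low nibble of the status byte
    low16: int         # bits 15-0, meaning depends on the message type


def decode_ump_header(word: int) -> UmpHeader:
    """Split the first 32-bit word of a UMP into its header fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("a UMP word must be an unsigned 32-bit integer")
    return UmpHeader(
        message_type=(word >> 28) & 0xF,
        group=(word >> 24) & 0xF,
        opcode=(word >> 20) & 0xF,
        channel=(word >> 16) & 0xF,
        low16=word & 0xFFFF,
    )


# Example word: MT=0x2, Group=15, status 0x91, remaining 16 bits 0x3C64.
header = decode_ump_header(0x2F913C64)
print(header.message_type, header.group, header.opcode, header.channel)  # 2 15 9 1
```

Returning a named tuple keeps the decoder free of side effects, so the same function can be reused by a parser that later decides how many more words to read.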
After the first 32-bit word, the remaining words (if any) carry the payload data of the message. MIDI 2.0 greatly expanded these data fields beyond the old 7-bit bytes. Notably, MIDI 2.0 Channel Voice messages use 64 bits (2 words): the first word contains the header (type, group, opcode/channel, etc.) and part of the data, and the second word contains the rest of the data. In general, MIDI 1.0 voice messages (like Note On/Off, CC, etc.) are encoded in a single 32-bit UMP word, whereas MIDI 2.0 voice messages use two 32-bit words (64 bits) to include higher-resolution values and additional fields. System Exclusive messages and other extended data use even larger packet sizes (up to 128 bits) to carry long byte sequences. The UMP format thus supports four packet sizes (1–4 words) to accommodate different message types:
- 32-bit UMP (1 word): Used for short messages like Utility messages, System Common/Real-Time, and encapsulated MIDI 1.0 channel messages.
- 64-bit UMP (2 words): Used for MIDI 2.0 Channel Voice messages and short (up to 6 bytes) SysEx or data messages.
- 96-bit UMP (3 words): Used if needed for certain data lengths (not commonly used by standard messages, but available for future expansion).
- 128-bit UMP (4 words): Used for large data messages like extended SysEx streams or the new Mixed Data Set messages (which can carry non-MIDI data).
UMP Header Breakdown: The UMP header (first word) has specific encoding rules. We covered the 4-bit MT and Group. The next byte (status/opcode + channel) is structured similarly to MIDI 1.0 status bytes, with the high nibble as a code and low nibble as channel. For instance, a UMP with MT = 0x2 (MIDI 1.0 voice message) might have a Status byte of 0x90 (1001nnnn in binary, where 1001 is the Note On opcode and nnnn is the channel) to indicate a Note On on that channel. The remaining bytes of the packet contain the data bytes of the MIDI message, padded where necessary to fit 32-bit alignment. In a 32-bit UMP carrying a MIDI 1.0 message, there are 2 bytes available after the status byte: exactly enough for the two data bytes of a Note On/Off or Control Change, while a message with a single data byte (such as Program Change) leaves the last byte as zero padding. For example, a MIDI 1.0 Note On (status 0x9n, note = 60, velocity = 100) on channel index 1 (channels are numbered 0–15 in the packet) could be encoded as:
- Byte 1: 0x2F (0x2 in the high nibble for “MIDI1 Voice Message”, 0xF in the low nibble for Group 15 in this example)
- Byte 2: 0x91 (0x9 Note On opcode, 0x1 channel number)
- Byte 3: 0x3C (0x3C = 60 decimal, the Note Number, with bit7=0 as per MIDI 1.0 data byte rules)
- Byte 4: 0x64 (0x64 = 100 decimal, the Velocity, with bit7=0)
This 32-bit word [0x2F 91 3C 64] encapsulates the entire MIDI 1.0 Note On message. In the UMP spec’s notation, this appears as 0x2 gggg 1001nnnn rkkkkkkk rvvvvvvv (where r indicates the required 0 in the MSB of data bytes). By contrast, a MIDI 2.0 Note On is encoded in a 64-bit UMP (2 words). The first word would have MT = 0x4 and status 1001nnnn (Note On opcode & channel), and the second word carries 16-bit Velocity and 16-bit Attribute Data (extra note-specific data). In practice, the 16-bit Index field in the first word is split into an 8-bit Note Number and an 8-bit Attribute Type, and the 32-bit second word is split into a 16-bit Velocity and 16-bit Attribute Value. So the Note On message went from 3 bytes in MIDI 1.0 (status plus two data bytes) to effectively 7 bytes in MIDI 2.0 (status, note number, attribute type, two velocity bytes, and two attribute bytes, not counting the Message Type/Group byte). This allows higher resolution velocity and an additional parameter (the Attribute) which can be used for things like articulation control or per-note pitch tweak (more on that later).
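The two Note On layouts can be compared side by side in code. This is a minimal sketch under the field positions described above; the function names `midi1_note_on_ump` and `midi2_note_on_ump` are made up for illustration and are not part of any MIDI library.

```python
def _check(name: str, value: int, limit: int) -> None:
    """Reject values that do not fit the field they are packed into."""
    if not 0 <= value <= limit:
        raise ValueError(f"{name} must be in 0..{limit}, got {value}")


def midi1_note_on_ump(group: int, channel: int, note: int, velocity: int) -> int:
    """Pack a MIDI 1.0 Note On into one 32-bit UMP word (MT = 0x2)."""
    _check("group", group, 0xF)
    _check("channel", channel, 0xF)
    _check("note", note, 0x7F)
    _check("velocity", velocity, 0x7F)
    return (0x2 << 28) | (group << 24) | (0x9 << 20) | (channel << 16) | (note << 8) | velocity


def midi2_note_on_ump(group: int, channel: int, note: int, velocity16: int,
                      attr_type: int = 0, attr_data: int = 0) -> tuple:
    """Pack a MIDI 2.0 Note On into two 32-bit UMP words (MT = 0x4)."""
    _check("group", group, 0xF)
    _check("channel", channel, 0xF)
    _check("note", note, 0x7F)
    _check("velocity16", velocity16, 0xFFFF)
    _check("attr_type", attr_type, 0xFF)
    _check("attr_data", attr_data, 0xFFFF)
    word0 = (0x4 << 28) | (group << 24) | (0x9 << 20) | (channel << 16) | (note << 8) | attr_type
    word1 = (velocity16 << 16) | attr_data
    return word0, word1


print(hex(midi1_note_on_ump(15, 1, 60, 100)))                   # 0x2f913c64
print([hex(w) for w in midi2_note_on_ump(0, 0, 60, 0x8000)])    # ['0x40903c00', '0x80000000']
```

The first call reproduces the worked example [0x2F 91 3C 64]; the second shows the same note with a 16-bit velocity and no attribute.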
UMP Message Types: The Message Type (MT) field is key to understanding how MIDI 2.0 handles various message categories. The MIDI 2.0 spec currently defines MT values 0x0 through 0xF, with several in active use:
- 0x0 – Utility: Utility or stream messages (like timestamps for jitter reduction, transport controls for files, etc.) that are 1-word long. An example is the Jitter Reduction Timestamp messages (JR Clock and JR Timestamp) which use Utility type messages.
- 0x1 – System Common & Real-Time: This covers MIDI system messages (System Common like MTC quarter-frame, Song Select, etc., and System Real-Time like Clock, Start, Stop). These are 1-word messages carrying the system status byte and any data bytes. For instance, a MIDI Clock (0xF8 in MIDI 1.0) would be encoded as MT=1 with status 0xF8 and no data bytes in the UMP.
- 0x2 – MIDI 1.0 Channel Voice: Encodes any standard MIDI 1.0 channel voice message (Note Off/On, Poly Pressure, Control Change, Program Change, Channel Pressure, Pitch Bend) in 32 bits. Essentially, this is a legacy MIDI message wrapped in a UMP. The status and 7-bit data bytes are placed into the 32-bit packet (with a padding byte if needed). For example, as shown above, a Note On uses both available data bytes and needs no padding; a Program Change uses one data byte (program number) and one pad byte. This type ensures that MIDI 2.0 devices can transmit and receive old MIDI messages without ambiguity.
- 0x3 – 8-byte Data Messages: These are 2-word (64-bit) packets designed mainly for System Exclusive 7-bit messages (the traditional SysEx format). Since SysEx messages can be longer than one packet, there are sub-types indicated by the Status field: e.g., Start of SysEx, Continue, End, or a Complete SysEx in one UMP. The 8-byte UMP can carry up to 6 bytes of actual SysEx data per packet (because 2 bytes in the packet are used for the MT/Group byte and for the status plus a byte count). These “SysEx7” messages still use the 7-bit data encoding (data bytes have bit7=0), but the 0xF0 start and 0xF7 end bytes are not carried inside the packets: the Status field marks where a message starts and ends, and the byte count says how many of the 6 data slots are in use.
- 0x4 – MIDI 2.0 Channel Voice: This is one of the core additions of MIDI 2.0. MT=0x4 indicates a MIDI 2.0 Channel Voice message, which is 2 words (64 bits) long and contains extended resolution data. All the familiar channel voice messages have a MIDI 2.0 version under this type: Note Off, Note On, Poly Pressure, Control Change, Program Change, Channel Pressure, and Pitch Bend – plus additional ones for per-note controllers (discussed in the next section). Each of these uses 16-bit or 32-bit data fields instead of 7-bit. For example, a MIDI 2.0 Control Change still identifies the controller by number (0–127, carried in the 16-bit index field of the first word) but the value is a full 32-bit value. A MIDI 2.0 Pitch Bend sends a 32-bit bend value instead of two 7-bit bytes. We will break down these messages in detail later. In UMP, a MIDI 2.0 voice message’s first 32-bit word contains MT, Group, Status (opcode & channel), and a 16-bit Index (which often encodes things like the note number or controller number), and the second 32-bit word contains the data: a single 32-bit value for controllers and pitch bend, or a 16-bit velocity plus 16-bit attribute data for notes.
- 0x5 – 16-byte Data Messages: These are 4-word (128-bit) packets, used for the new System Exclusive 8 messages and Mixed Data Set messages. SysEx 8 is a new SysEx format that allows bytes of any value (full 8-bit data) and marks the start and end of a message in the status field of each packet (so it doesn’t rely on an 0xF7 terminator). SysEx8 messages are broken into 16-byte UMPs with indicators for start, continuation, and end similar to SysEx7, but can carry up to 13 bytes of data per packet (the first 3 bytes of each packet hold the MT/Group byte, the status plus a byte count, and a stream ID). The advantage is that SysEx8 can transport arbitrary binary data (0–255) without the special bit7 handling of SysEx7, and with higher efficiency. The Mixed Data Set message is another 128-bit type, intended for large non-SysEx data transmissions such as bulk transfers (it includes a header to indicate the type of data).
- 0x6, 0x7, 0x8, 0x9, 0xA, 0xB, 0xC: (Reserved for future use by MMA/AMEI) – These values are left open for defining new message types as the protocol evolves.
- 0xD – Flex Data: A 128-bit message type for song and performance metadata such as tempo, time signature, key signature, chord names, and lyric or other text events. (Not widely used yet in 2025.)
- 0xE: (Reserved)
- 0xF – Stream: 128-bit UMP Stream messages. This category covers messages about the UMP connection itself rather than musical data, such as endpoint discovery, endpoint and function block information, and stream configuration. (The Jitter Reduction Clock and Timestamp messages belong to the Utility type, MT=0x0, as noted above, not to this one.)
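The SysEx7 segmentation described for MT=0x3 can be sketched as follows. This is an illustration rather than a reference implementation: the function name `sysex7_to_ump` is hypothetical, and the status nibble values (0x0 complete, 0x1 start, 0x2 continue, 0x3 end) and the position of the byte count in the low nibble of the second byte are stated here as assumptions to be checked against the UMP specification.

```python
from typing import List, Tuple

# Assumed status codes for the high nibble of the second byte.
COMPLETE, START, CONTINUE, END = 0x0, 0x1, 0x2, 0x3


def sysex7_to_ump(payload: bytes, group: int = 0) -> List[Tuple[int, int]]:
    """Split a SysEx payload (without F0/F7) into 64-bit MT=0x3 packets."""
    if not 0 <= group <= 0xF:
        raise ValueError("group must be in 0..15")
    if any(b > 0x7F for b in payload):
        raise ValueError("SysEx7 data bytes must have bit 7 clear")

    # Up to 6 data bytes fit in each packet; an empty payload still yields one packet.
    chunks = [payload[i:i + 6] for i in range(0, len(payload), 6)] or [b""]
    packets = []
    for index, chunk in enumerate(chunks):
        if len(chunks) == 1:
            status = COMPLETE
        elif index == 0:
            status = START
        elif index == len(chunks) - 1:
            status = END
        else:
            status = CONTINUE
        padded = chunk + bytes(6 - len(chunk))  # zero-fill unused data slots
        word0 = ((0x3 << 28) | (group << 24) | (status << 20)
                 | (len(chunk) << 16) | (padded[0] << 8) | padded[1])
        word1 = int.from_bytes(padded[2:6], "big")
        packets.append((word0, word1))
    return packets


for w0, w1 in sysex7_to_ump(bytes(range(8))):
    print(f"{w0:08X} {w1:08X}")
```

An 8-byte payload produces a Start packet with six data bytes followed by an End packet with two, which is the multi-packet case described in the list above.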
In summary, the UMP format is a unified packet structure that can carry any MIDI 1.0 or 2.0 message by specifying a Message Type. It expands the size of messages (beyond the 3-byte limit of MIDI 1.0) up to 16 bytes if needed, and aligns everything to 32-bit boundaries for efficiency on modern systems. The structured fields (Type, Group, Status, Index, etc.) mean that parsing a MIDI 2.0 message is straightforward – no running status or ambiguous message lengths as in MIDI 1.0’s stream. This predictable format also makes it easier to transport MIDI over high-speed links like USB or Ethernet, and even to save MIDI 2.0 data in files (Standard MIDI File v2.0 will simply record UMP packets). The UMP is the container that makes all of MIDI 2.0’s new capabilities possible, providing more bits when and where needed and leaving room for growth.
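Because the Message Type alone determines the packet length, a word stream can be split into packets without any running-status logic. The sketch below assumes the sizes listed above for the defined types; the sizes given for reserved types (0x6–0xC and 0xE) are an assumption about how the specification pre-assigns lengths, and `split_ump_stream` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple

# Words per packet for each Message Type. Entries for reserved types are assumed.
WORDS_BY_MESSAGE_TYPE = {
    0x0: 1, 0x1: 1, 0x2: 1,          # Utility, System, MIDI 1.0 Channel Voice
    0x3: 2, 0x4: 2,                  # 8-byte Data, MIDI 2.0 Channel Voice
    0x5: 4,                          # 16-byte Data
    0x6: 1, 0x7: 1,                  # reserved (assumed 32-bit)
    0x8: 2, 0x9: 2, 0xA: 2,          # reserved (assumed 64-bit)
    0xB: 3, 0xC: 3,                  # reserved (assumed 96-bit)
    0xD: 4, 0xE: 4, 0xF: 4,          # Flex Data, reserved (assumed), Stream
}


def split_ump_stream(words: Iterable[int]) -> List[Tuple[int, ...]]:
    """Group a flat sequence of 32-bit words into complete UMP packets."""
    words = list(words)
    packets = []
    i = 0
    while i < len(words):
        size = WORDS_BY_MESSAGE_TYPE[(words[i] >> 28) & 0xF]
        if i + size > len(words):
            raise ValueError("stream ends in the middle of a packet")
        packets.append(tuple(words[i:i + size]))
        i += size
    return packets


stream = [0x2F913C64, 0x40903C00, 0x80000000, 0x10F80000]
print([len(p) for p in split_ump_stream(stream)])  # [1, 2, 1]
```

The example stream holds a MIDI 1.0 Note On, a MIDI 2.0 Note On, and a MIDI Clock, and the splitter recovers the three packets from the top nibble of each first word.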
Bidirectional Communication and MIDI-CI
A cornerstone of MIDI 2.0 is that devices can talk to each other to negotiate features and exchange information. This is accomplished with the MIDI Capability Inquiry (MIDI-CI) protocol. In MIDI 1.0, there was no standard way for two devices to handshake capabilities – one device simply sent MIDI data and hoped the receiver understood it. MIDI 2.0 introduces a formal query/reply system (using SysEx messages) so that devices can discover each other’s supported features and configure themselves accordingly. This negotiation is what allows two MIDI 2.0 devices to agree to use the MIDI 2.0 protocol, or to fall back to MIDI 1.0 if one side doesn’t support 2.0, ensuring backward compatibility.
MIDI-CI Message Structure: MIDI-CI messages are a set of Universal System Exclusive messages defined specifically for inquiry and negotiation. They use the Universal Non-Realtime SysEx ID (0x7E, or 0x7F for realtime) and dedicated sub-ID bytes to indicate a MIDI-CI message. At the byte level, each MIDI-CI SysEx message looks like this:
F0 7E <DeviceID> 0D <SubID2> <Version> <Source MUID (4 bytes)> <Destination MUID (4 bytes)> <Data bytes...> F7
- F0 – Start of SysEx.
- 7E or 7F – Universal SysEx ID (7E = Non-Realtime, 7F = Realtime). MIDI-CI can use either, but typically Non-Realtime (7E) is used for capabilities negotiation.
- <DeviceID> – One byte target identifier. This works like MIDI’s normal SysEx Device ID: 7F indicates a broadcast (to all devices on the port), whereas 00–0F can address a specific device (often representing MIDI channels 1–16 on a port). In practice, an Initiator (the device starting the inquiry) will often send with DeviceID = 7F (port-wide inquiry) to discover any device that can respond. The responder will usually reply with DeviceID of the initiator (or 7F for port-wide if appropriate).
- 0D – Sub ID #1 identifying MIDI-CI. All MIDI-CI messages use 0x0D here, marking the message as a MIDI Capability Inquiry message.
- <SubID2> – A second ID byte that specifies the category and function of the MIDI-CI message. The MIDI-CI spec defines several categories by ranges of SubID2:
  - 0x10–0x1F: Protocol Negotiation Messages – used to negotiate which MIDI protocol to use (1.0 or 2.0).
  - 0x20–0x2F: Profile Configuration Messages – used to query or set MIDI Profiles (standardized sets of controller behaviors for an instrument type).
  - 0x30–0x3F: Property Exchange Messages – used to inquire, get, or set device properties (like product name, parameters, patch lists, etc., often using JSON data).
  - 0x00–0x0F and 0x40–0x7E are reserved (not used in MIDI-CI v1.0), and 0x7F is defined as a universal NAK (negative acknowledgment) for error handling.
- <Version> – One byte indicating the MIDI-CI message version/format. (This allows future updates to MIDI-CI. Current MIDI-CI uses version 0x01.)
- Source MUID – 4 bytes. This is a unique ID for the sender of the message (MUID = MIDI Unique ID); in an inquiry, that is the initiator. Because SysEx data bytes carry only 7 bits each, the four bytes hold a 28-bit value rather than a full 32 bits. The initiator includes its ID so the responder knows who it’s talking to. MUIDs are usually randomly generated or assigned by the host for each port – they help differentiate multiple devices or multiple ports on one device.
- Destination MUID – 4 bytes. In an Inquiry, this is often 0 or 0x7F... if the initiator doesn’t know who will respond. In a reply, the responder fills this with the initiator’s MUID (echoing it back). Essentially, MUIDs act like addresses in a conversation, ensuring replies can be matched to senders especially in a broadcast scenario.
- Data bytes... – The payload specific to the SubID2 category. Each type of CI message defines its own data format here. For example, for Protocol Negotiation, the data might include supported protocol bits and an “authority level.” For Profile Inquiry, it might include a list of profile IDs. For Property Exchange, it might include JSON or property IDs, etc.
- F7 – End of SysEx.
All MIDI-CI messages follow this common wrapper (7E 0D … F7) with their specific content inside. This design means older MIDI devices (which don’t understand 0x0D SysEx) will simply ignore these messages, whereas MIDI 2.0 devices will recognize and process them.
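The common wrapper can be assembled byte by byte. The sketch below follows the layout given above and makes two assumptions that should be checked against the MIDI-CI specification: each MUID is packed as four 7-bit bytes with the least significant 7 bits first, and an all-ones destination is used as a stand-in for "anyone". The names `encode_muid`, `build_ci_message`, and `ci_category` are hypothetical.

```python
def encode_muid(muid: int) -> bytes:
    """Pack a 28-bit MUID into four 7-bit bytes (assumed least significant first)."""
    if not 0 <= muid < (1 << 28):
        raise ValueError("MUID must fit in 28 bits")
    return bytes((muid >> (7 * i)) & 0x7F for i in range(4))


def build_ci_message(sub_id2: int, source_muid: int, destination_muid: int,
                     data: bytes = b"", device_id: int = 0x7F,
                     version: int = 0x01) -> bytes:
    """Wrap a MIDI-CI payload in the F0 7E <dev> 0D ... F7 envelope."""
    header = (device_id, sub_id2, version)
    if any(not 0 <= b <= 0x7F for b in header) or any(b > 0x7F for b in data):
        raise ValueError("every byte inside a SysEx message must be 7-bit")
    body = (bytes([0x7E, device_id, 0x0D, sub_id2, version])
            + encode_muid(source_muid) + encode_muid(destination_muid) + bytes(data))
    return b"\xF0" + body + b"\xF7"


def ci_category(sub_id2: int) -> str:
    """Map a Sub ID #2 value to the category ranges listed in the text."""
    if 0x10 <= sub_id2 <= 0x1F:
        return "protocol negotiation"
    if 0x20 <= sub_id2 <= 0x2F:
        return "profile configuration"
    if 0x30 <= sub_id2 <= 0x3F:
        return "property exchange"
    if sub_id2 == 0x7F:
        return "nak"
    return "reserved"


# Port-wide inquiry with a made-up source MUID and a two-byte example payload.
message = build_ci_message(0x10, 0x0123456, 0x0FFFFFFF, data=bytes([0x03, 0x40]))
print(message.hex(" "))
```

The payload bytes 0x03 and 0x40 are placeholders only; the real data layout for each Sub ID #2 is defined by the MIDI-CI specification.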
Capability Negotiation Process: Using these SysEx messages, two devices perform a handshake to decide how they will communicate and what features to use. One device acts as the Initiator (sending inquiries), and the other as the Responder. For example, when a new MIDI 2.0 controller connects to a synth, the controller might initiate a MIDI-CI exchange to find out if the synth supports MIDI 2.0 and what profile it might use. The typical sequence for Protocol Negotiation is:
- Inquiry (Protocol Negotiation Inquiry) – The Initiator sends a SysEx inquiry with SubID2 = 0x10 (Initiate Protocol Negotiation). This message not only asks “do you support MIDI 2.0?” but also reports the initiator’s own capabilities. Inside the data bytes, it includes a bitfield of supported protocols (e.g. a bit for MIDI 1.0, a bit for MIDI 2.0) and possibly other info like an “authority level.” The authority level is used if both devices try to initiate at the same time – the one with higher authority will act as initiator to avoid collision. Typically, a device will set bits indicating it can do MIDI 1.0 and MIDI 2.0 if it is 2.0-capable. If it’s a MIDI 2.0 device, it sets the MIDI 2.0 support bit; if it only does 1.0, it wouldn’t even engage in CI, so no reply would come.
- Response (Protocol Negotiation Reply) – If the other device understands the inquiry, it replies with SubID2 = 0x11 (Report Protocol Capabilities). The data bytes in the response include that device’s supported protocols bitfield and its own authority level. Now both devices know what each other can support.
- Set New Protocol – Based on the exchange, if both support MIDI 2.0, the Initiator will send SubID2 = 0x12 (Set New Protocol) to propose switching to MIDI 2.0. This message specifies the protocol to use (e.g., a byte indicating MIDI 2.0, and possibly the version of MIDI 2.0). The responder would then switch its parsing and generation to MIDI 2.0 format. At this point, the devices have “agreed” to speak MIDI 2.0 on that port.
- Protocol Test – The spec provides an optional step to verify the new protocol. For example, the Initiator can send a known test message in MIDI 2.0 format (SubID2 = 0x13 might be defined for “Protocol Test”) and expect a valid response. If the test fails (e.g. the responder doesn’t properly handle it), then they can fall back to MIDI 1.0. In practice, this fallback rarely needs to happen if negotiation was successful, but it’s a safety mechanism. The MIDI-CI spec explicitly says devices should revert to MIDI 1.0 if a negotiated feature isn’t working.
All of the above negotiation happens quickly and only at connection time. If a device does not respond to the initial inquiry (likely because it’s a MIDI 1.0 device with no knowledge of CI), the Initiator will realize that and simply continue operating in MIDI 1.0 mode. This is how backward compatibility is gracefully handled – new devices try a handshake, and if there’s no “conversation,” they assume the other end only understands MIDI 1.0 and stay in that mode.
Beyond protocol selection, MIDI-CI also negotiates other aspects:
- Profile Configuration: MIDI 2.0 devices can support Profiles, which are standardized sets of behaviors for certain instrument types (e.g. a “Piano Profile” for all 88-note digital pianos). Using MIDI-CI Profile messages (SubID2 0x20–0x2F), devices exchange a list of Profiles they support. If there’s a match, the Initiator can send a message to turn that Profile on. For instance, if both devices support the Piano Profile, they will configure themselves to use that profile’s defined controllers (so things like soft pedal, sostenuto, etc., are known to both). This greatly improves interoperability – two devices can automatically align their usage of MIDI messages to a common standard for that instrument type.
- Property Exchange (PE): Using SubID2 0x30–0x3F, devices can perform bulk data queries and edits using a standardized format (JSON embedded in SysEx). Property Exchange allows a device to ask another for properties like its name, manufacturer, presets, or even get/set detailed parameters in a synth. For example, a DAW could query a MIDI 2.0 synthesizer for its list of patch names via PE messages, and display them to the user. The data is exchanged in chunks of SysEx (often using a special SysEx8 stream due to potentially large size). The negotiation part of PE involves querying Property Exchange capabilities (like how many simultaneous property requests can be handled, maximum message size, etc.), then using a defined SysEx structure to request or send properties (identified by GUIDs or strings, with JSON values). This is a powerful new area that goes beyond “MIDI messages” into full device management – essentially making MIDI 2.0 a carrier for device API calls, all still within SysEx so it’s backward compatible (a MIDI 1.0 device would just ignore these messages if it somehow saw them).
As an example of bit-level structure, consider the Protocol Negotiation Inquiry message (SubID2 = 0x10) contents:
- After the header (DeviceID 7F, 0D, 10, version, MUIDs), the data might contain: a 1-byte bitfield where bit0 = “MIDI 1.0 support”, bit1 = “MIDI 2.0 support”, etc., followed by a 1-byte “authority level” (0–127, higher means this device prefers to be initiator). For instance, a MIDI 2.0 device might send 0x03 for support (binary 11, meaning it can do 1.0 and 2.0) and 0x40 for authority. The reply would include the responder’s support bits and its authority.
- A Set New Protocol message (0x12) likely has a 1-byte field indicating the chosen protocol (e.g. 0x02 for MIDI 2.0) and possibly the MIDI CI version or protocol version number.
The exact byte-level definitions are in the MIDI-CI spec, but the main point is that it’s all bit-wise defined within SysEx data and uses reserved ID values so as not to conflict with anything pre-existing. MIDI 2.0 devices are expected to implement at least Protocol Negotiation (so they don’t attempt 2.0 with someone who can’t).
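The decision logic behind the handshake is small enough to sketch directly. The bit assignments below (bit 0 for MIDI 1.0, bit 1 for MIDI 2.0) simply reuse the illustrative layout from the example above and are not the actual MIDI-CI encoding; `choose_protocol` and `pick_initiator` are hypothetical helper names.

```python
from typing import Optional

# Illustrative bit layout only, mirroring the example in the text.
MIDI1_BIT = 0x01
MIDI2_BIT = 0x02


def choose_protocol(local_support: int, remote_support: Optional[int]) -> str:
    """Pick the protocol to use once the inquiry has been answered (or not).

    remote_support is None when no reply arrived, which is treated as a
    MIDI 1.0-only peer.
    """
    if remote_support is None:
        return "MIDI 1.0"
    if local_support & remote_support & MIDI2_BIT:
        return "MIDI 2.0"
    return "MIDI 1.0"


def pick_initiator(local_authority: int, remote_authority: int) -> str:
    """Resolve a simultaneous inquiry: the higher authority level initiates."""
    if local_authority > remote_authority:
        return "local"
    if remote_authority > local_authority:
        return "remote"
    return "undecided"  # equal levels need a tie-break outside this sketch


print(choose_protocol(0x03, 0x03))   # MIDI 2.0
print(choose_protocol(0x03, None))   # MIDI 1.0
print(pick_initiator(0x40, 0x30))    # local
```

Treating a missing reply as an ordinary input keeps the fallback to MIDI 1.0 in the same code path as a successful negotiation.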
In summary, MIDI-CI is the mechanism that makes MIDI 2.0 devices self-configuring. Through these negotiation messages, devices can discover each other’s protocol support, agree on using MIDI 2.0, enable specific Profiles for common controller semantics, and perform Property Exchange to read/write settings. This greatly reduces manual setup – for example, two MIDI 2.0 instruments can automatically enable per-note expression if both support it, or a DAW can query a controller to label its knobs and faders automatically. All of this is enabled by SysEx-based messages structured down to the byte and bit level as described. MIDI 2.0’s use of MIDI-CI thus transforms MIDI from a monologue into a dialogue, where devices can coordinate how they will communicate, rather than just blindly sending data.
Increased Data Resolution and Parameter Scaling
One of the most significant improvements in MIDI 2.0 is the dramatic increase in data resolution for virtually all values. MIDI 1.0 was fundamentally limited to 7-bit resolution (0–127) for most parameters (notes, CCs, aftertouch, etc.), with a few 14-bit exceptions (pitch bend and some high-resolution CC pairs). MIDI 2.0 blows past these limits by allowing up to 32-bit resolution for controller values, velocities, and other parameters. This means instead of 128 discrete steps, you can have over 4 billion discrete values for a parameter, enabling ultra-fine control.
7-bit to 32-bit Expansion: In MIDI 2.0 Channel Voice messages, any data that was 7-bit in MIDI 1.0 is typically expanded to 32 bits (or to 16 bits in specific cases like note velocity). For example, continuous controllers (CCs) in MIDI 2.0 transmit 32-bit values. A standard MIDI 1.0 CC like Modulation (CC#1) which ranged 0–127 now can range from 0 to 4,294,967,295 in MIDI 2.0. This enormous range allows extremely fine adjustments – effectively continuous, analog-like control without perceivable stepping. Similarly, pitch bend, which was 14-bit (0–16383) in MIDI 1.0, can be up to 32-bit in MIDI 2.0 for even finer pitch adjustments (the spec suggests 21-bit or 32-bit resolution for pitch bend data).
To maintain compatibility and musical meaning, MIDI 2.0 defines how to scale up 7-bit or 14-bit values to 32-bit, and conversely how to downscale when translating back to MIDI 1.0. The general approach is to treat the MIDI 1.0 value as the most significant bits of the MIDI 2.0 value and use the additional bits for finer resolution (like fractional increments). Concretely, a 7-bit value can be thought of as a 7.25 fixed-point number when expressed in 32 bits (7 bits of integer and 25 bits of fraction). For example, the value 127 (maximum 7-bit) would scale to 0xFE000000 or 0xFFFFFFFF in 32-bit, depending on whether you simply shift or scale to the full range. The MIDI 2.0 spec provides recommended scaling rules:
- Upscaling (Low-Res to High-Res): When converting a 7-bit value to 32-bit, one simple method is to left-shift the value by 25 bits (i.e., multiply by 2^25). This maps 0→0 and 127→0xFE000000 (which is 4261412864 in decimal). To use the whole range, some implementations might instead multiply by about 33818640 (which is 0x02040810, roughly 0xFFFFFFFF/127); with that integer constant 127 lands on 0xFFFFFFF0 (4294967280), so a final rounding or bit-fill step is still needed to reach exactly 0xFFFFFFFF (4294967295). Either way, the values in between are distributed linearly. For instance, 7-bit value 64 (middle) maps to 0x80000000 with the shift and to slightly above it with the multiplier. Critically, the mapping is defined such that if you later downscale (by right-shifting 25 bits), you recover the original 7-bit value. This ensures compatibility: a 7-bit only device will interpret the top bits correctly.
- Downscaling (High-Res to Low-Res): When converting 32-bit to 7-bit (for example, sending to a MIDI 1.0 device), take the top 7 bits of the 32-bit value, i.e. right-shift by 25 and drop the lower 25 bits. This is truncation rather than rounding to nearest, and it exactly inverts the left-shift used for upscaling. For example, suppose a 32-bit controller value is 0x4000_0000. That is exactly 25% of the 2^32 range, and the corresponding 7-bit value is 32 (25% of 128): 0x4000_0000 >> 25 = 0x20. The spec’s tables show examples at other widths too: a 7-bit value of 5 becomes 0x0A00 in 16-bit (5 << 9 = 2560 decimal), and a 16-bit value of 0x8000 downscales to 64 (the midpoint).
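The shift-based rules in this list reduce to two one-line functions. This is a minimal sketch of the plain shift method only (it does not stretch 127 to 0xFFFFFFFF); the names `upscale_shift` and `downscale_shift` are made up for illustration.

```python
def upscale_shift(value: int, src_bits: int, dst_bits: int) -> int:
    """Place a low-resolution value in the top bits of a wider field."""
    if dst_bits < src_bits:
        raise ValueError("destination must be at least as wide as the source")
    if not 0 <= value < (1 << src_bits):
        raise ValueError(f"value does not fit in {src_bits} bits")
    return value << (dst_bits - src_bits)


def downscale_shift(value: int, src_bits: int, dst_bits: int) -> int:
    """Keep only the top dst_bits of a high-resolution value (truncation)."""
    if dst_bits > src_bits:
        raise ValueError("destination must not be wider than the source")
    if not 0 <= value < (1 << src_bits):
        raise ValueError(f"value does not fit in {src_bits} bits")
    return value >> (src_bits - dst_bits)


print(hex(upscale_shift(127, 7, 32)))        # 0xfe000000
print(hex(upscale_shift(5, 7, 16)))          # 0xa00
print(downscale_shift(0x40000000, 32, 7))    # 32
print(downscale_shift(0x8000, 16, 7))        # 64

# Round trip: every 7-bit value survives a trip through 32 bits.
assert all(downscale_shift(upscale_shift(v, 7, 32), 32, 7) == v for v in range(128))
```

The round-trip check at the end is the compatibility property the list describes: upscaling followed by downscaling returns the original 7-bit value.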
16-bit Values: Not every value uses the full 32 bits – note velocity in MIDI 2.0 is defined as 16-bit (0–65535). The choice of 16 bits for velocity was a balance between resolution and message size (since Note On/Off already have added attributes). 16 bits gives 512 times as many steps as MIDI 1.0 velocity (65,536 versus 128), which is plenty for nuanced dynamics. The simplest scaling from a MIDI 1.0 velocity to MIDI 2.0 is to multiply by 512 (shift left 9 bits), so that 127→0xFE00, which is 65024 (127×512). The spec suggests that 127 map to 0xFFFF (65535) so that velocity uses the full range, in which case you’d scale by 65535/127 ≈ 516 instead (127×516 = 65532, and rounding brings it to 65535). Either approach yields minimal difference; the key is that velocity=0 in MIDI 2.0 still means no sound (with the caveat that a Note On with velocity 0 is not used as a Note Off in MIDI 2.0).
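The two velocity mappings mentioned here can be compared numerically. The sketch below uses integer arithmetic with round-to-nearest for the full-range variant; it illustrates plain linear scaling and is not the specification's own translation algorithm. Both function names are hypothetical.

```python
def velocity_shift(v7: int) -> int:
    """7-bit velocity to 16-bit by shifting left 9 bits (127 -> 0xFE00)."""
    if not 0 <= v7 <= 127:
        raise ValueError("velocity must be in 0..127")
    return v7 << 9


def velocity_full_range(v7: int) -> int:
    """7-bit velocity to 16-bit by linear scaling so that 127 -> 0xFFFF."""
    if not 0 <= v7 <= 127:
        raise ValueError("velocity must be in 0..127")
    return (v7 * 0xFFFF + 63) // 127  # integer round-to-nearest of v7 * 65535 / 127


for v in (0, 1, 64, 127):
    print(v, velocity_shift(v), velocity_full_range(v))
```

At the top of the range the two results differ by 511 steps out of 65,536 (65024 versus 65535), which is the "minimal difference" referred to above. Note that plain linear scaling sends 64 to 33026 rather than to the exact midpoint 0x8000.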
To illustrate: 7-bit vs 32-bit example:
Take MIDI 1.0 Pan (CC10) at center = 64. In MIDI 2.0, that might be represented as 0x8000_0000 (half of the 2^32 range). If you had Pan = 65 (one step right of center) in 7-bit, in 32-bit that would be 0x8200_0000 with shift scaling. The increments are extremely fine now: 0x0200_0000 (about 33.5 million) 32-bit steps lie between two adjacent 7-bit values. Conversely, if a synth sends a high-resolution value like 0x8100_0000 and it’s being received by a MIDI 1.0 device, the receiving system will scale it down to 0x8100_0000 >> 25 = 0x40 (i.e., 64 out of 127), so the half-step above center is lost in translation.
Pitch Bend and Fine Resolution: MIDI 1.0 pitch bend had 14-bit resolution (16,384 steps). Many instruments used only a portion of that (though some used it for microtonal control). MIDI 2.0 can send pitch data as a full 32-bit value (which the spec calls 7.25 format if fully used). But practically, not all synths will use all 25 fractional bits. Some might use 7.14 (21 bits), which already gives 16,384 steps per semitone. The spec explicitly allows 16, 21, or 32-bit for pitch. This means a manufacturer might decide their synth engine is precise enough for 14 fractional bits (giving over 16000 steps per semitone, which is far beyond human hearing resolution), and just use that. They would still receive a 32-bit bend, but perhaps ignore the lowest 11 bits. The idea is that MIDI 2.0 doesn’t force everything to use 32-bit internal math, but provides it so nothing is bottlenecked by the transport.
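Ignoring the lowest bits of an incoming 32-bit value is a mask operation. The sketch below shows a receiver that keeps only the top 21 bits (7 integer plus 14 fractional, as in the example above) and discards the remaining 11; `keep_top_bits` is a hypothetical helper, not a function from the specification.

```python
def keep_top_bits(value32: int, kept_bits: int) -> int:
    """Zero the low (32 - kept_bits) bits of an unsigned 32-bit value."""
    if not 0 <= value32 <= 0xFFFFFFFF:
        raise ValueError("value must be an unsigned 32-bit integer")
    if not 1 <= kept_bits <= 32:
        raise ValueError("kept_bits must be in 1..32")
    dropped = 32 - kept_bits
    return (value32 >> dropped) << dropped


# A receiver with 21 bits of internal precision ignores the lowest 11 bits.
print(hex(keep_top_bits(0xFFFFFFFF, 21)))  # 0xfffff800
print(hex(keep_top_bits(0x800007FF, 21)))  # 0x80000000
```

The second call shows that a value just above center collapses back to the exact center once the 11 unused bits are dropped, so the sender's extra precision costs the receiver nothing.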
Per-Note Parameters: The increased resolution also extends to the new per-note controllers (coming up next). For example, MIDI 2.0 defines Per-Note Pitch as a 32-bit value representing a fine offset for the pitch of an individual note. This can allow digital instrument techniques like tuning each note of a chord in just intonation on the fly, or achieving the seamless pitch bends for individual notes as in MPE but with greater accuracy.
Analog Feel: In practical terms, 32-bit resolution (over 4 billion steps) greatly exceeds the resolution of any physical knob or human motion – it’s essentially continuous for all intents and purposes. The MIDI Association highlighted that 32-bit gives controls a smooth, continuous “analog” feel. For example, a filter cutoff sweep will not step even slightly; you could have thousands of incremental steps between what used to be two adjacent MIDI values. And because the data is integer (no floating point issues, just large ints), devices can implement it without ambiguity.
Center values and offsets: With higher resolution, representing things like center (e.g., pitch bend zero, pan center) needs careful mapping. The spec defines that the center of an even-numbered range should map exactly. For instance, 14-bit pitch bend center was 8192; in 32-bit it should be 2147483648 (0x8000_0000). They ensure that scaling preserves the center value by a formula (Highest+1)/2. Indeed, if highest is 0xFFFFFFFF (4294967295), half of that rounded up is 2147483648, which would be used as center (the LSB might be unused so it’s an exact midpoint).
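One way to get minimum, center, and maximum to all map exactly is a shift followed by bit repetition in the upper half of the range. The sketch below is written from memory of the bit-repeat scheme described in the MIDI 2.0 translation rules and should be treated as an approximation of it, not as the normative algorithm; `scale_up` is a hypothetical name.

```python
def scale_up(value: int, src_bits: int, dst_bits: int) -> int:
    """Upscale so that 0, the center, and the maximum all map exactly."""
    if src_bits < 2 or dst_bits < src_bits:
        raise ValueError("need 2 <= src_bits <= dst_bits")
    if not 0 <= value < (1 << src_bits):
        raise ValueError(f"value does not fit in {src_bits} bits")

    scale_bits = dst_bits - src_bits
    shifted = value << scale_bits
    center = 1 << (src_bits - 1)
    if value <= center:
        return shifted  # lower half and center: plain shift

    # Upper half: fill the new low bits by repeating the bits below the MSB.
    repeat_bits = src_bits - 1
    repeat = value & ((1 << repeat_bits) - 1)
    if scale_bits > repeat_bits:
        repeat <<= scale_bits - repeat_bits
    else:
        repeat >>= repeat_bits - scale_bits
    while repeat:
        shifted |= repeat
        repeat >>= repeat_bits
    return shifted


print(hex(scale_up(64, 7, 32)))      # 0x80000000
print(hex(scale_up(127, 7, 32)))     # 0xffffffff
print(hex(scale_up(8192, 14, 32)))   # 0x80000000
```

Because only the newly added low bits are filled in, right-shifting the result back to the source width still returns the original value, while the 7-bit and 14-bit centers both land on 0x8000_0000.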
In summary, MIDI 2.0’s increased data resolution means:
- Controllers and parameters that were 7-bit are now 32-bit (giving 4,294,967,296 discrete values).
- Key dimensions like Note Velocity, which was 7-bit, are now 16-bit (65,536 values), allowing extremely fine dynamic control.
- Pitch bend can be up to 32-bit (although devices might internally use 16 or 21 bits), eliminating stair-stepping in bends or vibrato.
- The protocol defines exact bit scaling so that values map consistently up and down (so a 7-bit only device still interoperates correctly).
- Musically, the benefit is smoother curves and more precise expression. Long filter sweeps, slow fades, LFO modulations, etc., become silk-smooth. Per-note adjustments can be incredibly subtle, enabling expressive techniques like morphing one note’s timbre slightly differently than another’s in the same chord.
Developers have to manage these larger numbers, but today’s computing easily handles 32-bit math. The payoff is that MIDI 2.0 messages carry a richness of data that can closely mirror the nuance of analog modulation or the detail of high-resolution automation. It essentially removes the MIDI data resolution as a bottleneck for expressiveness.