SMS TPDU (Transfer Protocol Data Unit)

April 1st, 2024

The SMS subsystem is quite powerful and even supports binary data transmission. Switching the modem to PDU mode gives convenient access to most of these features, yet interpreting or assembling PDUs by hand can be quite tedious. The mode is selected as follows:

> AT+CMGF=<mode>

The value of 0 activates the PDU mode, while the value of 1 activates the text mode.

Configuring the storage memory

> AT+CPMS=<read_and_delete>[,<write_and_send>[,<incoming>]]

Memory (affected AT commands): possible values
read_and_delete (+CMGL, +CMGR, +CMGD): ME = Modem's Flash Memory, MT = ME and SM combined, SM = SIM Card, SR = Status Report Storage
write_and_send (+CMGW, +CMSS): ME = Modem's Flash Memory, MT = ME and SM combined, SM = SIM Card
incoming (none): ME = Modem's Flash Memory, SM = SIM Card

Reading SMS messages

List messages by type from the active storage memory:

> AT+CMGL=<stat>

< +CMGL: <index>,<stat>,[<alpha>],<length>
< <pdu>
< 
< +CMGL: <index>,<stat>,[<alpha>],<length>
< <pdu>
< 
< ...
< 
< OK

Parameter: description
stat: 0 = received unread message(s), 1 = received read message(s), 2 = stored unsent message(s), 3 = stored sent message(s), 4 = all messages
index: 0..n, position in the active storage memory
alpha: Phonebook entry associated with the originating or destination address, depending on the message type.
length: Length in octets of the PDU without the SMSC part.
pdu: SMSC and the PDU as described below, encoded as a hex string (every two characters make up one octet).
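
The listing format above can be picked apart with a few lines of Python. This is only a sketch: `parse_cmgl` is a made-up helper name, the sample reply is invented (its PDUs are not valid messages), and real modems may add echo lines or unsolicited result codes that would need extra filtering.

```python
import re

# Header line of one +CMGL entry in PDU mode: index, stat, optional alpha, length.
_CMGL_HEADER = re.compile(r"\+CMGL:\s*(\d+),(\d+),(.*),(\d+)$")

def parse_cmgl(response: str) -> list[dict]:
    """Collect the entries of a PDU-mode +CMGL reply (hypothetical helper)."""
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    entries = []
    # Every header line is immediately followed by its hex-encoded PDU.
    for header, pdu in zip(lines, lines[1:]):
        match = _CMGL_HEADER.match(header)
        if match:
            index, stat, alpha, length = match.groups()
            entries.append({
                "index": int(index),
                "stat": int(stat),
                "alpha": alpha.strip('"') or None,  # alpha is optional
                "length": int(length),              # octets without the SMSC part
                "pdu": pdu,
            })
    return entries

# Invented reply, for illustration only.
sample = "+CMGL: 1,0,,4\r\n00AABBCCDD\r\n+CMGL: 2,1,,3\r\n00112233\r\n\r\nOK\r\n"
for entry in parse_cmgl(sample):
    print(entry)
```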

Read a single message by its index in the active storage memory:

> AT+CMGR=<index>

< +CMGR: <stat>,[<alpha>],<length>
< <pdu>
< 
< OK

The fields are the same as for the +CMGL command.

Sending SMS messages

> AT+CMGS=<length>
> <pdu><CTRL-Z or ESC>

Note that the length is the number of octets of the PDU without the SMSC part.

A CTRL-Z signal confirms data entry and requests the modem to send the SMS. An ESC signal cancels the transaction.
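
Since the length excludes the SMSC part, it has to be derived from the PDU rather than from the length of the hex string. A minimal sketch (the function name `cmgs_length` is an assumption, and the example is a widely circulated sample Submit PDU carrying the text "hellohello"):

```python
def cmgs_length(pdu_hex: str) -> int:
    """Number of octets to announce in AT+CMGS for a hex-encoded PDU."""
    data = bytes.fromhex(pdu_hex)
    sca_length = data[0]  # first octet: length of the SMSC part that follows
    # Skip the SCA Length octet itself plus the SMSC data it announces.
    return len(data) - 1 - sca_length

pdu = "0011000B916407281553F80000AA0AE8329BFD4697D9EC37"
print(f"AT+CMGS={cmgs_length(pdu)}")
```

The same TPDU prefixed with an explicit SMSC (for example `07911326040000F0`) yields the same length value.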

PDU structure

[Diagram: General PDU Structure]

The general PDU structure consists of the SMSC part, the TPDU type and the actual TPDU. Note that the length values in AT commands do not include the SMSC part. The SMSC part was added in GSM Phase 2 and its coding differs from the regular originating/destination address fields. I could not find the original definition of the field's format, only isolated references to it.

Field (Data Type): Description
SCA Length (Integer [0..22]): Length of the Service Center Address (SMSC) in octets, counting the two following fields. Zero denotes no SMSC data (the default one is used for sending).
SCA Type of Address (Bit-coded): See Type of Address below.
SCA: Service Center Address (SMSC), encoded as specified in SCA Type of Address.
TPDU Type (Bit-coded): Denotes the actual type of the TPDU, see below.
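
Unlike the regular address fields, the SCA Length counts octets, not digits. The following sketch separates the SMSC part from the TPDU; `split_smsc` is a hypothetical name, and the digits are assumed to be stored low nibble first with an F filler, as for the other address fields. The sample is a commonly quoted Deliver PDU.

```python
def split_smsc(pdu_hex: str):
    """Return (type_of_address, smsc_digits, tpdu_bytes) for a hex PDU."""
    data = bytes.fromhex(pdu_hex)
    sca_length = data[0]
    if sca_length == 0:
        return None, None, data[1:]       # no SMSC data, TPDU follows directly
    type_of_address = data[1]
    digits = ""
    for octet in data[2:1 + sca_length]:  # length covers the type octet as well
        digits += f"{octet & 0x0F:X}{octet >> 4:X}"
    return type_of_address, digits.rstrip("F"), data[1 + sca_length:]

toa, smsc, tpdu = split_smsc(
    "07911326040000F0040B911346610089F60000208062917314080CC8F71D14969741F977FD07"
)
print(hex(toa), smsc, len(tpdu))
```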

TPDU Types

[Diagram: TPDU Types]

The TP-MTI Message Type Indicator specifies the TPDU type:

TP-MTI Value: TPDU Type if direction SC-to-MS / TPDU Type if direction MS-to-SC
00: SMS-DELIVER / SMS-DELIVER-REPORT
01: SMS-SUBMIT-REPORT / SMS-SUBMIT
10: SMS-STATUS-REPORT / SMS-COMMAND
11: reserved / reserved

Other fields:

Field: Available Values
TP-RD (Reject Duplicates): 0 = SC to accept a message with the same TP-MR and TP-DA, 1 = SC to reject a message with the same TP-MR and TP-DA that is still held at the SC
TP-MMS (More Messages to Send): 0 = more messages are waiting for the MS in this SC, 1 = no more messages are waiting
TP-VPF (Validity Period Format): 00 = not present, 01 = enhanced format, 10 = relative format, 11 = absolute format
TP-SRR (Status Report Request): 0 = not requested, 1 = requested
TP-SRI (Status Report Indication): 0 = shall not be returned to SME, 1 = shall be returned to SME
TP-UDHI (User Data Header Indication): 0 = no header, 1 = beginning of User Data contains a header
TP-RP (Reply Path): 0 = reply path parameter not set, 1 = reply path parameter set
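
The text above does not spell out where these flags sit inside the TPDU type octet. The sketch below uses the bit positions as I read them in TS 123 040 for SMS-SUBMIT and SMS-DELIVER (TP-MTI in bits 1:0, TP-RD or TP-MMS in bit 2, TP-VPF in bits 4:3, TP-SRR or TP-SRI in bit 5, TP-UDHI in bit 6, TP-RP in bit 7); treat them as an assumption to verify against the specification. `decode_first_octet` is a made-up name, and the other TPDU types are only named, not decoded.

```python
TPDU_TYPES = {
    # (TP-MTI, direction is MS-to-SC): TPDU type
    (0b00, False): "SMS-DELIVER",
    (0b00, True): "SMS-DELIVER-REPORT",
    (0b01, False): "SMS-SUBMIT-REPORT",
    (0b01, True): "SMS-SUBMIT",
    (0b10, False): "SMS-STATUS-REPORT",
    (0b10, True): "SMS-COMMAND",
}

def decode_first_octet(octet: int, ms_to_sc: bool) -> dict:
    """Decode the TPDU type octet of a Submit or Deliver PDU."""
    fields = {"type": TPDU_TYPES.get((octet & 0b11, ms_to_sc), "reserved")}
    if fields["type"] == "SMS-SUBMIT":
        fields["TP-RD"] = (octet >> 2) & 1
        fields["TP-VPF"] = (octet >> 3) & 0b11
        fields["TP-SRR"] = (octet >> 5) & 1
    elif fields["type"] == "SMS-DELIVER":
        fields["TP-MMS"] = (octet >> 2) & 1
        fields["TP-SRI"] = (octet >> 5) & 1
    else:
        return fields  # flags of the remaining types are not handled here
    fields["TP-UDHI"] = (octet >> 6) & 1
    fields["TP-RP"] = (octet >> 7) & 1
    return fields

print(decode_first_octet(0x11, ms_to_sc=True))
print(decode_first_octet(0x44, ms_to_sc=False))
```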

Submit and Deliver PDUs

[Diagram: Submit and Deliver PDUs]

The Submit PDU is a request to send a text message, while the Deliver PDU is the form as seen by the recipient. Both share a few (nearly) identical fields:

Field (Submit, Deliver; Data Type): Description
(* = mandatory, o = optional, - = not present)
TP-MR Message Reference (Submit *, Deliver -; Integer [0..255]): Like a unique message identifier, relevant for delivery reports.
TP-DA Destination Address (Submit *, Deliver -; Address Type): Addressee of the message. Maximum field size is 12 octets (TS 123 040, Chapter 9.1.2.5 Address Fields).
TP-PID Protocol Identifier (Submit *, Deliver *; Bit-coded): See below.
TP-DCS Data Coding Scheme (Submit *, Deliver *; Bit-coded): See below.
TP-VP Validity Period (Submit o, Deliver -; Integer [0..255] or Timestamp): Data type depends on the TP-VPF value. For the Timestamp data type: see below.
TP-UDL User Data Length (Submit *, Deliver *; Integer [0..160]): Units depend on the TP-DCS value (septets for GSM7, octets otherwise).
TP-UD User Data (Submit *, Deliver *): GSM 7-bit coded text, UCS2 (UTF-16) coded text or binary data, as specified in the TP-DCS field. Optionally begins with a header, as indicated by TP-UDHI.
TP-OA Originating Address (Submit -, Deliver *; Address Type): Sender of the message.
TP-SCTS Service Centre Time Stamp (Submit -, Deliver *; Timestamp): Point of time at which the SMSC received the message. For the Timestamp data type: see below.
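
To see how the Submit fields line up, here is a rough sketch that assembles a single-part SMS-SUBMIT with default SMSC, relative validity period and GSM7 text. It assumes the fields appear in the order of the table above, and it only accepts letters, digits and spaces, whose GSM7 codes happen to equal their ASCII codes. `build_submit_pdu` is a hypothetical helper, not part of any library.

```python
def build_submit_pdu(number: str, text: str, validity: int = 0xAA) -> tuple[str, int]:
    """Return (hex PDU, length for AT+CMGS) for a short GSM7 message."""
    digits = number.lstrip("+")
    if not digits.isdigit() or len(digits) > 20:
        raise ValueError("expected 1 to 20 decimal digits")
    # Assumption: restrict to characters identical in ASCII and GSM7.
    if len(text) > 160 or not all(c.isascii() and (c.isalnum() or c == " ") for c in text):
        raise ValueError("only up to 160 ASCII letters, digits and spaces")
    type_of_address = 0x91 if number.startswith("+") else 0x81
    padded = digits + "F" * (len(digits) % 2)  # odd length: pad with F
    address = bytes(int(padded[i + 1] + padded[i], 16) for i in range(0, len(padded), 2))
    septets = text.encode("ascii")
    value = 0
    for i, septet in enumerate(septets):       # septet i occupies bits 7i..7i+6
        value |= septet << (7 * i)
    user_data = value.to_bytes((7 * len(septets) + 7) // 8, "little")
    pdu = (
        bytes([0x00])                          # SCA Length 0: use the default SMSC
        + bytes([0x11])                        # SMS-SUBMIT, TP-VPF = relative
        + bytes([0x00])                        # TP-MR
        + bytes([len(digits), type_of_address]) + address  # TP-DA
        + bytes([0x00, 0x00])                  # TP-PID, TP-DCS (GSM7)
        + bytes([validity, len(septets)])      # TP-VP, TP-UDL in septets
        + user_data
    )
    return pdu.hex().upper(), len(pdu) - 1

print(build_submit_pdu("+46708251358", "hellohello"))
```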

Protocol Identifier

Bits 7:6: Description
00: Bit 5: 0 = no telematic interworking, 1 = telematic interworking. Bits 4:0 can be equal to 00000, meaning an implicit device type (specific to this SC, or one that can be concluded on the basis of the address). Other values are specified in TS 123 040, Page 65, Chapter 9.2.3.9 TP-Protocol-Identifier (TP-PID).
01: Bits 5:0 specify the handling of different message types, see TS 123 040, Page 66, Chapter 9.2.3.9 TP-Protocol-Identifier (TP-PID).
10: reserved
11: SC-specific use.

Data Coding Scheme

Bits 7:6 = 00:
Bit 5 (Compressed-flag): 0 = not compressed, 1 = compressed
Bit 4 (Message-Class-flag): 0 = none, 1 = see bits 1:0
Bits 3:2 (Alphabet): 00 = default (GSM7), 01 = 8-bit, 10 = UCS2 (UTF-16), 11 = reserved
Bits 1:0 (Message-Class): 00 = class 0, 01 = ME-specific, 10 = SIM-specific, 11 = TE-specific
Description: A value of 00000000 denotes the default alphabet in GSM Phase 2.

Bits 7:6 = 01: reserved

Bits 7:6 = 10: reserved

Bits 7:6 = 11, bit 5 = 0, bit 4 = 1:
Bit 3 (Indication): 0 = inactive, 1 = active
Bit 2: 0
Bits 1:0 (Waiting): 00 = voice mail, 01 = fax, 10 = email, 11 = other
Description: Default alphabet.

Bits 7:6 = 11, bit 5 = 1, bit 4 = 0: As above (1101xxxx), but UCS2 alphabet.

Bits 7:6 = 11, bit 5 = 1, bit 4 = 1:
Bit 3: 0
Bit 2 (Alphabet): 0 = default, 1 = 8-bit
Bits 1:0: Class
Description: Data.
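
The table translates fairly directly into code. The sketch below covers only the combinations listed above and returns `None` for everything else; `decode_dcs` and the dictionary keys are illustrative names, not taken from any library.

```python
ALPHABETS = ["GSM7", "8-bit", "UCS2", "reserved"]
WAITING = ["voice mail", "fax", "email", "other"]

def decode_dcs(dcs: int):
    """Decode a TP-DCS octet according to the table above (sketch)."""
    group = dcs >> 6
    if group == 0b00:
        return {
            "compressed": bool(dcs & 0x20),
            "alphabet": ALPHABETS[(dcs >> 2) & 0b11],
            # The class in bits 1:0 only counts if the Message-Class-flag is set.
            "message_class": (dcs & 0b11) if dcs & 0x10 else None,
        }
    if group == 0b11:
        sub = (dcs >> 4) & 0b11
        if sub in (0b01, 0b10):  # indication groups: default alphabet or UCS2
            return {
                "alphabet": "GSM7" if sub == 0b01 else "UCS2",
                "indication_active": bool(dcs & 0x08),
                "waiting": WAITING[dcs & 0b11],
            }
        if sub == 0b11:          # data coding with message class
            return {
                "compressed": False,
                "alphabet": "8-bit" if dcs & 0x04 else "GSM7",
                "message_class": dcs & 0b11,
            }
    return None  # reserved or not covered by the table above

for value in (0x00, 0x08, 0x15, 0xD8, 0xF5):
    print(f"0x{value:02X}", decode_dcs(value))
```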

Validity Period

TP-VPF Value Description
00 TP-VP field not present
01 TP-VP field present - enhanced format
10 TP-VP field present - relative format
11 TP-VP field present - absolute format (see Timestamp)

Relative Validity Period

It's a one-octet value defined as follows:

TP-VP Value Description
0..143 (TP-VP + 1) x 5 minutes
144..167 12 hours + ((TP-VP - 143) x 30 minutes)
168..196 (TP-VP - 166) x 1 day
197..255 (TP-VP - 192) x 1 week
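
The four ranges map onto a small function. This is just the table above written out, with `relative_validity` being a name chosen for this example:

```python
from datetime import timedelta

def relative_validity(tp_vp: int) -> timedelta:
    """Convert a relative TP-VP octet into a duration."""
    if not 0 <= tp_vp <= 255:
        raise ValueError("TP-VP must fit into one octet")
    if tp_vp <= 143:
        return timedelta(minutes=(tp_vp + 1) * 5)
    if tp_vp <= 167:
        return timedelta(hours=12, minutes=(tp_vp - 143) * 30)
    if tp_vp <= 196:
        return timedelta(days=tp_vp - 166)
    return timedelta(weeks=tp_vp - 192)

# 0xAA (170) is a value often seen in sample PDUs.
print(relative_validity(0xAA))
```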

Enhanced Validity Period

It's a 7-octet field (always 7 octets, no matter how many are actually used) with the following content:

First octet of TP-VP: Description
Bit 7 (Extension): 1 = additional functionality indicator present, 0 = otherwise
Bit 6 (Single shot SM): 1 = SC should make one delivery attempt
Bits 5:3: reserved
Bits 2:0 = 000: No validity period specified.
Bits 2:0 = 001: Relative Validity Period. Same format as for a regular relative validity period.
Bits 2:0 = 010: Relative Validity Period as a number of seconds. Integer 0..255, where 0 is a reserved value.
Bits 2:0 = 011: Relative Validity Period as hours, minutes and seconds. Three octets containing BCD-encoded hours, minutes and seconds as in the TP-SCTS field.
Bits 2:0 other values: reserved

Following octets of TP-VP: Either another functionality indicator (if bit 7 is set) or the value specified by bits 2:0.

Timestamp

The field is laid out as shown in the Submit and Deliver PDUs diagram and contains the date, the time and a time zone value.

Field Description
Y2:Y1 BCD-encoded last two digits of the year.
M2:M1 BCD-encoded month.
D2:D1 BCD-encoded day of month.
h2:h1 BCD-encoded hour (24-hour format).
m2:m1 BCD-encoded minutes.
s2:s1 BCD-encoded seconds.
TZ2:TZ1 BCD-encoded Time Zone with a sign. The most significant bit of TZ1, which is bit 3 (0-based) of the octet, is 0 for a positive value or 1 for a negative value. The Time Zone is a number of 15-minute intervals (e.g., a value of 0x80, which is 8 quarters, means +2 hours).
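
A decoding sketch for the 7-octet timestamp follows. It relies on the low nibble of each octet holding the first (tens) digit, which is consistent with the 0x80 example above. Two assumptions of mine: the two-digit year is mapped to 20xx, and the function name `decode_timestamp` is invented.

```python
from datetime import datetime, timedelta, timezone

def swapped_bcd(octet: int) -> int:
    """Low nibble is the tens digit, high nibble the units digit."""
    return (octet & 0x0F) * 10 + (octet >> 4)

def decode_timestamp(field: bytes) -> datetime:
    """Decode a 7-octet timestamp into a timezone-aware datetime."""
    if len(field) != 7:
        raise ValueError("a timestamp has exactly 7 octets")
    year, month, day, hour, minute, second = (swapped_bcd(o) for o in field[:6])
    tz = field[6]
    quarters = (tz & 0x07) * 10 + (tz >> 4)  # mask out the sign bit (bit 3)
    if tz & 0x08:
        quarters = -quarters
    offset = timezone(timedelta(minutes=15 * quarters))
    # Assumption: two-digit years belong to the 21st century.
    return datetime(2000 + year, month, day, hour, minute, second, tzinfo=offset)

print(decode_timestamp(bytes.fromhex("42401021035480")).isoformat())
```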

Destination and Originating Addresses

Field (Data Type): Description
Address Length (Integer [1..20]): Number of usable digits (semi-octets). Therefore, number of octets = ceil(Address Length / 2).
Type of Address (Bit-coded): See below.
Address (BCD): Binary coded decimal; lower 4 bits first; padded with 1111. Special digits: 1010 = "*"; 1011 = "#"; 1100 = "a"; 1101 = "b"; 1110 = "c"

Type of Address

Bit 7 is fixed to 1, meaning no extension.

Bits 6:4 (Type of Number): Description
000: Unknown number (format according to the network plan; may contain prefix or escape digits).
001: International number (leading + must be added manually though).
010: National number (no prefix and no escape digits).
011: Network-specific number (administrative/service number specific to the servicing network).
100: Subscriber number (short number representation if known to the SC).
101: Alphanumeric (GSM7 alphabet).
110: Abbreviated number.
111: reserved

Bits 3:0 (Numbering Plan): Description
0000: Unknown. Valid for alphanumeric Type of Number.
0001: ISDN/telephone numbering plan (E.164/E.163)
0011: Data numbering plan (X.121)
0100: Telex numbering plan
0101: Service Centre Specific plan (SM-TL addressing only)
0110: Service Centre Specific plan (SM-TL addressing only)
1000: National numbering plan
1001: Private numbering plan
1010: ERMES numbering plan (ETSI DE/PS 3 01-3)
1111: Reserved for extension
All other values: reserved
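
Splitting a Type of Address octet into its parts is a matter of shifting and masking. A small sketch with made-up names (`decode_type_of_address`, `TYPE_OF_NUMBER`):

```python
TYPE_OF_NUMBER = [
    "unknown", "international", "national", "network-specific",
    "subscriber", "alphanumeric", "abbreviated", "reserved",
]

def decode_type_of_address(octet: int) -> dict:
    """Split a Type of Address octet into its three parts."""
    return {
        "no_extension": bool(octet & 0x80),               # bit 7, fixed to 1
        "type_of_number": TYPE_OF_NUMBER[(octet >> 4) & 0b111],
        "numbering_plan": octet & 0x0F,                   # see the second table
    }

print(decode_type_of_address(0x91))  # the usual value for +<country code>... numbers
```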

Address (BCD)

[Diagram: Address Digits]

The above diagram depicts the way a phone number gets encoded. An odd-length number is padded with a 1111 value (F hex) in the upper half of the last octet. An even-length number fills all octets completely.
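
Encoding and decoding of a numeric address can be sketched as below. The helpers `encode_address` and `decode_address` are hypothetical; they handle decimal digits only (not the special digits `*`, `#`, `a`, `b`, `c`) and use the type 0x91 for numbers written with a leading + and 0x81 otherwise, which is my choice for the example.

```python
def encode_address(number: str) -> bytes:
    """Encode a phone number as Address Length, Type of Address, BCD digits."""
    digits = number.lstrip("+")
    if not digits.isdigit() or len(digits) > 20:
        raise ValueError("expected 1 to 20 decimal digits")
    type_of_address = 0x91 if number.startswith("+") else 0x81
    padded = digits + "F" * (len(digits) % 2)  # odd length: pad with F
    # Each octet carries two digits, the first one in the lower 4 bits.
    body = bytes(int(padded[i + 1] + padded[i], 16) for i in range(0, len(padded), 2))
    return bytes([len(digits), type_of_address]) + body

def decode_address(data: bytes) -> tuple[str, int]:
    """Return (number, octets consumed) for an address at the start of data."""
    length, type_of_address = data[0], data[1]
    octets = (length + 1) // 2                 # ceil(Address Length / 2)
    digits = "".join(f"{o & 0x0F:X}{o >> 4:X}" for o in data[2:2 + octets])[:length]
    prefix = "+" if (type_of_address >> 4) & 0b111 == 0b001 else ""
    return prefix + digits, 2 + octets

encoded = encode_address("+46708251358")
print(encoded.hex().upper(), decode_address(encoded))
```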

User Data Header

[Diagram: User Data Header]

Field (Data Type): Description
TP-UDHL User Data Header Length (Integer [0..n]): Number of octets consumed by the header (excluding the length field itself and excluding any fill bits).
IEI Information Element Identifier (Enum): See below.
IED Length (Integer [0..n]): Number of octets consumed by the data of this element (excluding the IEI and excluding the length field itself).
IED Information Element Data: Data specific to the IEI.

There can be any number of information elements. In that case the group of fields IEI, IED Length and IED is repeated for each element within the header.

If the User Data is encoded using GSM7 (the 7-bit alphabet), the text following the header must start on a septet boundary. Therefore, depending on the header size, a number of fill bits may be necessary to align it. This allows older devices that do not understand the header to still decode the rest of the message.
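
The number of fill bits follows from the header size alone. A short sketch (function names are mine): the header occupies TP-UDHL + 1 octets including the length octet, and the text starts at the next multiple of 7 bits.

```python
def udh_fill_bits(udhl: int) -> int:
    """Fill bits needed after a header of `udhl` octets (length octet excluded)."""
    header_bits = (udhl + 1) * 8  # the TP-UDHL octet itself counts as well
    return (7 - header_bits % 7) % 7

def udh_septets(udhl: int) -> int:
    """Septets occupied by the header including its fill bits."""
    return ((udhl + 1) * 8 + udh_fill_bits(udhl)) // 7

print(udh_fill_bits(5), udh_septets(5))
```

For the 5-octet concatenation header this gives 48 header bits plus 1 fill bit, so the text begins at the eighth septet.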

Information Element Identifier

The table on page 76 of TS 123 040 (chapter 9.2.3.24 TP-User Data (TP-UD)) lists the possible values of the IEI field. The only value relevant here seems to be 0x00, which denotes a concatenated message. Others include special SMS, application addressing schemes, sound/melody, animation/picture, e-mail header, voice mail etc.

Octets and Septets Alignment

[Diagram: Octets and Septets Alignment]

The best way to understand the bit alignment of octets and septets is to look at the data in reverse order. Septets, lined up from right to left, fit into octets from right to left simply by cutting the bit stream every 8 bits.
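
The "reverse order" view corresponds to reading the octets as one little-endian integer, which makes unpacking almost trivial. The sketch below works on raw septet values and does not translate between GSM7 and Unicode; the example only works because lower-case letters have the same codes in ASCII and GSM7. Function names are illustrative.

```python
def pack_septets(septets, fill_bits: int = 0) -> bytes:
    """Pack 7-bit values into octets, optionally after some fill bits."""
    acc, nbits = 0, fill_bits
    out = bytearray()
    for septet in septets:
        acc |= (septet & 0x7F) << nbits  # append above the bits already held
        nbits += 7
        while nbits >= 8:                # cut off every complete octet
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)           # remaining bits, zero-padded
    return bytes(out)

def unpack_septets(data: bytes, count: int, fill_bits: int = 0) -> list[int]:
    """Read `count` septets, treating the octets as one little-endian number."""
    value = int.from_bytes(data, "little") >> fill_bits
    return [(value >> (7 * i)) & 0x7F for i in range(count)]

packed = pack_septets(b"hellohello")
print(packed.hex().upper())
print(bytes(unpack_septets(packed, 10)).decode("ascii"))
```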

Multi-Part SMS (Concatenated)

[Diagram: Multi-Part Concatenated SMS]

The above diagram is an example of a concatenated SMS with GSM7 encoding.

Field Example Value Description
TP-UDHL 0x05 The header data consists of 5 octets.
IEI 0x00 The first (and only) IEI of value 0x00 denotes a concatenated SMS.
IED Length 0x03 This IEI has 3 fixed fields, therefore the length.
Reference No. 0x20 This message has a unique number 0x20. The recipient requires this number to match all fragments.
Total fragments 0x03 The complete message consists of 3 fragments. A value of 0 requires the recipient to ignore the complete IEI.
Sequence No. 0x01 This fragment is the first one in the sequence. A value of 0, or a value greater than the total number of fragments, requires the recipient to ignore the complete IEI.
Fill 1 bit To match the septet-interval, a single bit must be added before continuing with the user data.
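
Walking through the header of the example can be sketched like this. `parse_udh` and `concatenation_info` are hypothetical helpers; the second one applies the "ignore the complete IEI" rules from the table by returning `None`.

```python
def parse_udh(user_data: bytes) -> list[tuple[int, bytes]]:
    """Return the (IEI, IED) pairs found in the header at the start of TP-UD."""
    udhl = user_data[0]
    header = user_data[1:1 + udhl]
    elements = []
    pos = 0
    while pos + 2 <= len(header):
        iei, ied_length = header[pos], header[pos + 1]
        elements.append((iei, header[pos + 2:pos + 2 + ied_length]))
        pos += 2 + ied_length
    return elements

def concatenation_info(elements):
    """Return (reference, total, sequence) of IEI 0x00, or None if absent/invalid."""
    for iei, ied in elements:
        if iei == 0x00 and len(ied) == 3:
            reference, total, sequence = ied
            if total == 0 or sequence == 0 or sequence > total:
                return None  # the recipient has to ignore the element
            return reference, total, sequence
    return None

# Header from the table above, followed by one dummy octet of user data.
elements = parse_udh(bytes.fromhex("050003200301") + b"\x00")
print(elements, concatenation_info(elements))
```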

