What is PTP?
Picture Transfer Protocol was drafted in 2000 to create a standardized vendor-extensible image and file transfer protocol for digital cameras and printers.
This effort was led by PIMA and many other camera manufacturers who, at the time, needed a universal protocol for transfering images from a camera to a computer over USB. At the time every company needed to invest in not only creating their own proprietary protocol, but creating drivers and software for their users.
Like many file transfer protocols at the time, PTP doesn't expose a filesystem. Instead, it exposes a generic API for listing, downloading, and uploading objects. An 'object' can be anything, but in most cases it represents a file or folder on the device's storage medium.
There's a few reasons why PTP was designed this way:
- Abstracting away the filesystem allowed the camera to store images in any way it liked (even without a filesystem)
- Removed the risk of filesystem corruption
- Allows the camera to safely modify objects while the client is downloading/uploading data
PTP is also simple and vendor-extensible. This allows OS developers to create clients that can communicate with any camera, while camera manufacturers can add additional functionality on top of PTP without breaking compatibility with all clients.
In 2005 PTP was standardized. It was updated in 2008, and again in 2013 with new features to improve performance.
In 2011, Microsoft released MTP as a superset of PTP with many more features and data types. The MTP spec is freely available and well-written, and is great documentation for anybody who wants to write a client.
How does it work?
PTP starts with the initiator, or the client. This is the device that issues commands to the responder, which can be a camera, printer, or phone.
The initiator can issue commands such as OpenSession
or GetDeviceInfo
, and the responder will reply back with a a return code, parameters, payload, etc.
Another important thing to understand is that PTP transactions consist of phases. A typical transaction will consist of a request phase, an (optional) data phase, and a response phase.
- The operation phase will always consist of a single request packet sent from the initiator.
- The data phase can consist of a data packet sent from either the initiator or responder, depending on the operation.
- The response phase will always consist of a single response packet.
For example, an operation like GetDeviceInfo
will consist of data received from the responder, while an operation like SetDevicePropValue
will have data sent from the initiator.
Note that the initiator and responder cannot both send data in the same transaction.
PtpOpenSession
Let's look at a very simple transaction from the viewpoint of the initiator. Here is the first packet we will send to the responder, after a USB connection is established:
10 00 00 00
01 00
02 10
01 00 00 00
00 00 00 00
Keep in mind that PTP data structures are encoded as little endian, and fields may not be aligned.
This might be easier to look at if we encode it as a C struct.
struct PtpContainer {
uint32_t length;
uint16_t type;
uint16_t code;
uint32_t transaction_id;
uint32_t params[5];
}command = {
.length = 16,
.type = 1,
.code = 0x1002,
.transaction_id = 0,
.params = {
0
}
};
Generally the first request sent to the device will be PtpOpenSession
or 0x1002
. This opens a session with the responder.
A session is a connection between the initator and responder. Each session has it's own state, transaction IDs, object IDs, and descriptors.
The only command allowed outside a session is GetDeviceInfo
. PTP is designed this way to allow multiple concurrent sessions from different initiators. Sadly, it doesn't seem like any PTP device
supports this feature.
- Since this packet is sent in the request phase, the
type
field for this packet isPTP_PACKET_TYPE_COMMAND
or1
. - A rule for
PtpOpenSession
: parameter 0 must contain theSessionID
, a unique ID identifying a session. This can be whatever we want. - Note that the
length
field is16
. The minimium size of a request packet is12
and the first parameter adds 4 bytes. There can be up to 5 parameters in a command packet. - A rule for for
PtpOpenSession
:transaction_id
will always be 0. The transaction ID is maintained by the initiator and incremented by1
after every completed transaction.
Since PtpOpenSession
doesn't have a data phase, the response phase is next, with a PTP_PACKET_TYPE_RESPONSE
packet (3
) issued from the responder:
0c 00 00 00
03 00
01 20
01 00 00 00
- Here we see the
code
field is0x2001
orPTP_RC_OK
. Keep in mind all PTP codes start with a prefix in the higher 16 bits. For response codes (RC) it's0x20
, for operation codes (OC) it's0x10
, etc. - Since this is packet is part of the same transaction, the transaction ID is the same.
Now, the transaction is over, and we will increment the client's internal transaction ID counter before issuing the next transaction.
Another important thing to take note of is that even though all packets are labeled with a transaction ID, a transaction must be fully completed before others are issued (in each session). You cannot issue another transaction while one is in progress. If the transaction ID changes before the transaction is over, then something is wrong.
PtpGetDeviceInfo
Next, let's have our client send a PtpGetDeviceInfo
request. This operation returns a data structure describing the device we are communicating with.
Here is the request packet we will send to the responder:
0c 00 00 00
01 00
01 10
02 00 00 00
The transaction id field is now 2
, and the code field is now set to 0x1001
.
When we read data from the responder, we should get both a data packet (PTP_PACKET_TYPE_DATA
) and a response packet (PTP_PACKET_TYPE_RESPONSE
), in that order.
The data packet is big, but is easy to decode:
00000000 0b 02 00 00 02 00 01 10 01 00 00 00 64 00 06 00 |............d...|
00000010 00 00 64 00 00 00 00 a7 00 00 00 14 10 15 10 16 |..d.............|
00000020 10 01 10 02 10 03 10 06 10 04 10 01 91 05 10 02 |................|
00000030 91 07 10 08 10 03 91 09 10 04 91 0a 10 1b 10 07 |................|
00000040 91 0c 10 0d 10 0b 10 05 91 0f 10 06 91 10 91 27 |...............'|
00000050 91 0b 91 08 91 09 91 0c 91 0e 91 0f 91 15 91 14 |................|
00000060 91 13 91 16 91 17 91 20 91 f0 91 18 91 21 91 f1 |....... .....!..|
00000070 91 1d 91 0a 91 1b 91 1c 91 1e 91 1a 91 53 91 54 |.............S.T|
00000080 91 60 91 55 91 57 91 58 91 59 91 5a 91 1f 91 fe |. .U.W.X.Y.Z....|
00000090 91 ff 91 28 91 29 91 2d 91 2e 91 2f 91 2c 91 30 |...(.).-.../.,.0|
000000a0 91 31 91 32 91 33 91 34 91 2b 91 35 91 36 91 37 |.1.2.3.4.+.5.6.7|
000000b0 91 38 91 39 91 3a 91 3b 91 3c 91 da 91 db 91 dc |.8.9.:.;.<......|
000000c0 91 dd 91 de 91 d8 91 d9 91 d7 91 d5 91 2f 90 41 |............./.A|
000000d0 91 42 91 43 91 3f 91 33 90 68 90 69 90 6a 90 6b |.B.C.?.3.h.i.j.k|
000000e0 90 6c 90 6d 90 6e 90 6f 90 3d 91 80 91 81 91 82 |.l.m.n.o.=......|
000000f0 91 83 91 84 91 85 91 40 91 01 98 02 98 03 98 04 |.......@........|
00000100 98 05 98 c0 91 c1 91 c2 91 c3 91 c4 91 c5 91 c6 |................|
00000110 91 c7 91 c8 91 c9 91 ca 91 cb 91 cc 91 ce 91 cf |................|
00000120 91 d0 91 d1 91 d2 91 e1 91 e2 91 e3 91 e4 91 e5 |................|
00000130 91 e6 91 e7 91 e8 91 e9 91 ea 91 eb 91 ec 91 ed |................|
00000140 91 ee 91 ef 91 f8 91 f9 91 f2 91 f3 91 f4 91 f7 |................|
00000150 91 22 91 23 91 24 91 f5 91 f6 91 52 90 53 90 57 |.".#.$.....R.S.W|
00000160 90 58 90 59 90 5a 90 5f 90 07 00 00 00 09 40 04 |.X.Y.Z._......@.|
00000170 40 05 40 03 40 02 40 07 40 01 c1 05 00 00 00 02 |@.@.@.@.@.......|
00000180 d4 07 d4 06 d4 03 d3 01 50 01 00 00 00 01 38 0c |........P.....8.|
00000190 00 00 00 01 30 02 30 06 30 0a 30 08 30 01 38 01 |....0.0.0.0.0.8.|
000001a0 b1 03 b1 02 bf 00 38 04 b1 05 b1 0b 43 00 61 00 |......8.....C.a.|
000001b0 6e 00 6f 00 6e 00 20 00 49 00 6e 00 63 00 2e 00 |n.o.n. .I.n.c...|
000001c0 00 00 13 43 00 61 00 6e 00 6f 00 6e 00 20 00 45 |...C.a.n.o.n. .E|
000001d0 00 4f 00 53 00 20 00 52 00 65 00 62 00 65 00 6c |.O.S. .R.e.b.e.l|
000001e0 00 20 00 54 00 36 00 00 00 08 33 00 2d 00 31 00 |. .T.6....3.-.1.|
000001f0 2e 00 32 00 2e 00 30 00 00 00 08 38 00 32 00 38 |..2...0....8.2.8|
00000200 00 61 00 66 00 35 00 36 00 00 00 |.a.f.5.6...|
0000020b
Data packets have no parameters, so the payload always starts at offset 12
.
The beginning of the payload starts with 3 fields:
uint16_t standard_version;
uint32_t vendor_ext_id;
uint16_t version;
What follows next are 6 PTP_TC_UINT16ARRAY
arrays describing what codes the device supports. Parsing these is as simple as
reading the first 4 bytes as the array length (uint32_t
), reading the next length * 2
bytes as uint16_t
.
Next are 4 strings. Almost all strings in PTP are encoded as wide strings. The first byte is read as the length of the string (in chars),
and the rest are a series of wide chars (int16_t
) with a null terminator. The length field includes the null terminator.
Sending data to the responder
For example, sending a SetDevicePropValue
request:
10 00 00 00
01 00
16 10
03 00 00 00
05 50 00 00
The first parameter is 0x5005
, or PTP_PC_WhiteBalance
As described in the MTP spec, PTP_PC_WhiteBalance
is a UINT16
data type.
So the data packet in this case will be 2 bytes:
0e 00 00 00
02 00
16 10
03 00 00 00
01 00
This sets PTP_PC_WhiteBalance
to 0x0001
, or 'Manual'.
After this, we expect a response packet from the responder.
What's next?
The packet structure described in this document is for PTP/USB only. PTP/IP has a totally different packet structure and way of sending packets. Thankfully, I have started to document it here.
Notes on USB
A PTP/USB device should have at least two endpoints - one for sending data and the other for receiving. There may also be an interrupt endpoint. This is a read only endpoint that the client may poll to get event codes.
The class for PTP devices should be 0x06
for Still Imaging devices, as described by the USB forum
vcam
vcam is an responder implementation of PTP, MTP, and other proprietary PTP extensions that can be used with unmodified PTP clients. This may be useful if you want test yours.