ADC Protocol Draft 0.8



1      Intro

1.1       About

1.2       Credits

2      Structure

2.1       General

2.2       Message Layout

2.2.1        Message types

2.2.2        Client identification (CID)

2.3       Files

2.3.1        File names and structure

2.3.2        Hashes

2.3.3        File list

3      BASE implementation

3.1       Client – Hub communication

3.2       Client – Client communication

3.3       Actions

3.3.1        STA

3.3.2        SUP

3.3.3        INF

3.3.4        MSG

3.3.5        SCH

3.3.6        RES

3.3.7        CTM

3.3.8        RCM

3.3.9        GPA

3.3.10      PAS

3.3.11      QUI

3.3.12      DSC

3.3.13      GET

3.3.14      GFI

3.3.15      SND

3.3.16      NTD

4      Examples

4.1       Client – Hub connection

4.2       Client – Client connection

5      Standard Extensions

5.1       REGEX

5.2       ZLIB

5.2.1        ZLIB-FULL

5.2.2        ZLIB-GET

5.3       UCMD


1         Intro

1.1      About

This is a text protocol for a DC style network that I could support. What I'm after is a simple protocol that doesn't require much effort in either hub or client, yet is extensible. It addresses some of the issues in the NMDC protocol, the most interesting being extensibility and hub bandwidth. The same protocol structure is used for both client-hub and client-client communication. This document is split into two parts: the first shows the structure of the protocol, while the second implements a specific system using this structure. ADC stands for anything you would like it to stand for; Advanced DC is the first neutral thing that springs to mind, apart from the obvious =).

1.2      Credits

Many ideas for this I’ve taken from Jan Vidar Krey’s DCTNG draft, others come from the DC dev hub people (notably cologic, fusbar and sedulus). Oh, and not to forget, Jon Hess for the original DC idea.

2         Structure

2.1      General

2.2      Message Layout


The typical message looks like this:


XYYY myCID <targetCID> p0 p1 ... pn AAq0 AAq1 ... XXqn\n



X

Message type

YYY

Action, three letters

myCID

CID of the sender

<targetCID>

Target CID, type D messages only

p0 … pn

Positional parameters, these are always mandatory

q0 … qn

Named parameters, these are optional unless otherwise noted. Each consists of a two-character code that identifies the named parameter, followed by its value. Named parameters may also take the empty value, usually meaning that their effect is being withdrawn (for example OP rights removed for an op). Names may be reused with different meanings in different commands, but developers must strive to avoid name clashes in the relatively limited namespace within the context of a command, for example by publishing names and their effects on the ADC board (wherever that may be =).


Since action is separated from the message type, the client should ignore the type, and only look at the three action letters, although some sanity check filtering should be done to ensure proper operation even with buggy clients / hubs. This allows clients to support features sent in new ways without changing the hub (search targeted at one user for instance).

It is valid to send unknown messages, but it is preferred that they’re preceded with proper SUP to avoid sending garbage that nobody understands anyway. Other clients can be notified of extended features by adding flags to the INF.

All messages must have the originating CID specified as first parameter. Type D must also have the target CID as second parameter.

Each named parameter has the form AAyyy where AA are two upper-case letters and yyy some arbitrary data associated with the parameter. Named parameters are used to add special processing options to commands, and if a flag requires that the other party interprets the command in a non-standard way (compression for instance), a SUP is required to make sure both parties understand the flag correctly.
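To make the layout concrete, here is a minimal Python sketch of how a receiver might split a message into its parts. It assumes that tokens are separated by single spaces and contain no spaces themselves (this draft does not define any escaping), and the names Message, parse_message and split_named are made up for the illustration, as are the CID placeholders in the usage note.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Message:
    msg_type: str                 # single letter, e.g. "B" or "D"
    action: str                   # three letters, e.g. "INF"
    sender_cid: str
    target_cid: Optional[str]     # only set for type D messages
    params: List[str] = field(default_factory=list)


def parse_message(line: str) -> Message:
    """Split one newline-terminated message into its parts."""
    tokens = line.rstrip("\n").split(" ")
    if len(tokens) < 2 or len(tokens[0]) != 4:
        raise ValueError("expected 'XYYY myCID ...'")
    msg_type, action = tokens[0][0], tokens[0][1:]
    sender, rest = tokens[1], tokens[2:]
    target = None
    if msg_type == "D":
        if not rest:
            raise ValueError("type D message without target CID")
        target, rest = rest[0], rest[1:]
    return Message(msg_type, action, sender, target, rest)


def split_named(params: List[str], positional: int) -> Dict[str, str]:
    """Map the named parameters (after `positional` fixed ones) to values."""
    return {p[:2]: p[2:] for p in params[positional:]}
```

For example, parse_message("DCTM AAAA BBBB ADC/1.0 1511 TO1234\n") yields type D, action CTM, sender AAAA and target BBBB, and split_named(m.params, 2) then returns the TO token. How many parameters are positional depends on the action, so the caller has to supply that number.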

2.2.1      Message types


A

Active broadcast. Message should be broadcast to all UDP active clients.


B

Broadcast. Hub should send message to all connected clients.


C

Client message. All TCP client-client messages are sent like this (hubs will never see this type).


D

Direct message. The target CID must be inserted after the myCID but before the other parameters of the action. Apart from sending the message to the target, an exact copy must always be sent to the source to confirm that the hub has correctly processed the message.


I

Info message. This message originated from the hub. myCID will always be the CID of the hub.


H

Hub message. This message is intended for the hub only, not relayed to other clients.


P

Passive broadcast. Message should be broadcast to all UDP passive clients.


U

UDP message. Message is sent directly with UDP to the target client (hubs will never see this type).


2.2.2      Client identification (CID)

Connected clients are identified by a CID (Client IDentification), which globally and uniquely identifies a particular user. It is invalid for two clients with the same CID to connect to the same hub, and hubs must enforce this. Clients should also use the same CID when connecting to multiple hubs. If clients offer different shares on different hubs, they must keep track of where a connecting client comes from so that the correct files always will be available. Clients should also strive to keep the same CID between sessions, to ease the implementation of favorite users and queue handling.

CIDs are 64 bits in length, and should be generated by creating a UUID according to the DCE UUID standard (several libraries exist for this) and then XOR’ing the high and low 64-bit parts of that 128-bit value together.
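The folding step could look like the following Python sketch. It assumes that a random (version 4) UUID from the standard uuid module is an acceptable DCE style UUID; the function names are made up for the illustration, and the draft does not say how the resulting 64-bit value is written in a message, so the sketch stops at an integer.

```python
import uuid

MASK64 = (1 << 64) - 1


def fold_uuid(value: int) -> int:
    """XOR the high and low 64-bit halves of a 128-bit integer."""
    return (value >> 64) ^ (value & MASK64)


def generate_cid() -> int:
    """Create a fresh 64-bit CID from a random (version 4) UUID."""
    return fold_uuid(uuid.uuid4().int)
```

A client would call generate_cid() once and store the result, so that the same CID is used on every hub and in later sessions.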

It is up to the hub developer to decide whether to base hub registration on CID or nickname (during login, the client (usually) provides both), but the latter is probably more convenient for the users.

2.3      Files

2.3.1      File names and structure

Filenames are relative to a fictive root in the user’s share. The ‘/’ is used to separate directories, and each file or directory name must be unique in a case-insensitive context. Any viewable characters (including space, char code >= 32) are valid in file names; a ‘/’ within a name is escaped with ‘\’. Clients must then take care to properly filter the filename for the target file system, but must be ready to request filenames from other clients according to these rules. The special names ‘.’ and ‘..’ may not occur as a directory or filename; any file list received containing those must be completely ignored.

The shared files are identified relative to the unnamed root ‘/’ (“/dir/subdir/filename.ext”), while extensions can extend this namespace by adding a named root, preferably using their SUP name (“TTHR/<root-base32>” is, for example, used to locate a file in the share by TTH root instead of filename). Rootless filenames are treated as special (they may not appear in the file listing) and can be used to supply binary transfers of arbitrary data, but extensions should use a named root instead to avoid polluting the namespace.

The special, rootless, filename “files.xml” specifies the file listing, uncompressed, in XML using the utf-8 encoding. Clients can then compress this list and offer the compressed one on a SUP basis. I recommend bzip2 or generic zlib compressed transfers for this task, although the uncompressed list must always be available.

2.3.2      Hashes

Hashing is mandatory for files shared in an ADC client. For files, a Merkle hash tree is used to create a full tree of hashes. The Tiger algorithm is used as the hash algorithm, and a base segment size of 1024 bytes must be used when generating the tree, but clients may discard as many levels as they see fit to reduce the space required for the tree data. A minimum of 6 levels of leaf data should always be available so that partially downloaded files can be verified and repaired (for files larger than what can be covered by that number of levels with the base block size). In general, more levels allow for more fine-grained verification, and considering that bandwidth is usually a lot more expensive than storage, it makes sense to keep more tree data.

Generally, the root of the tree is used to identify a file uniquely within the network. It is used for searches and must always be present in the file list (incidentally, the root of the file list must also be available, and is discoverable by using GFI). The rest of the tree may be requested at a later stage using the normal client-client transfer procedure. The root is always encoded using base-32 encoding when converted to text.
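As a rough sketch of how such a tree is built, the Python below hashes 1024-byte segments into leaves and combines them pairwise up to the root. Two things are assumptions and not taken from this draft: the 0x00 / 0x01 prefix bytes and the promotion of an unpaired node follow the THEX tree hash convention, and SHA-1 is used purely as a stand-in because Python's standard library does not guarantee a Tiger implementation. A real client must plug in Tiger for h.

```python
import hashlib
from typing import Callable, List

BASE_SEGMENT = 1024  # bytes per leaf, as required above


def sha1_stand_in(data: bytes) -> bytes:
    """Placeholder hash only; ADC requires Tiger here."""
    return hashlib.sha1(data).digest()


def leaf_hashes(data: bytes, h: Callable[[bytes], bytes]) -> List[bytes]:
    """Hash each 1024-byte segment; an empty file still yields one leaf."""
    segments = [data[i:i + BASE_SEGMENT]
                for i in range(0, len(data), BASE_SEGMENT)] or [b""]
    return [h(b"\x00" + s) for s in segments]


def next_level(level: List[bytes], h: Callable[[bytes], bytes]) -> List[bytes]:
    """Combine nodes pairwise; an unpaired last node is promoted unchanged."""
    out = []
    for i in range(0, len(level), 2):
        pair = level[i:i + 2]
        out.append(h(b"\x01" + pair[0] + pair[1]) if len(pair) == 2 else pair[0])
    return out


def tree_root(data: bytes, h: Callable[[bytes], bytes] = sha1_stand_in) -> bytes:
    """Reduce the leaves level by level until only the root remains."""
    level = leaf_hashes(data, h)
    while len(level) > 1:
        level = next_level(level, h)
    return level[0]
```

Keeping each intermediate level returned by next_level is what allows a client to store, and later serve, a chosen number of tree levels.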

2.3.3      File list

files.xml is the list of files intended for browsing. It has the following general structure:


<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<FileListing Version="1" Generator="DC++ 0.401">

  <Directory Name="share">

    <Directory Name="DC++ Prerelease">

      <File Name="DCPlusPlus.pdb" Size="17648640" TTH="xxxxxxxxx"/>

      <File Name="DCPlusPlus.exe" Size="946176" TTH="xxxxxxxxx"/>

    </Directory>

    <File Name="ADC.doc" Size="154112" TTH="xxxxxxxxx"/>

  </Directory>

</FileListing>

“encoding” must always be set to utf-8. Clients must be prepared to handle xml files with a BOM (byte order mark). If no byte order mark is present, clients must default to little endian (intel) byte order.

“Version” is not intended to change unless a breaking change is made to the structure of the file.

“Generator” is for statistical and informative purposes only and should not be used for extra content discovery.

“TTH” is the base32 encoded TTH root of the file.

More information may be added to the file by extensions, but is not guaranteed to be interpreted by other clients.
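For illustration, a client could walk such a listing with Python's standard xml.etree.ElementTree module as sketched below. The sample is a shortened, well-formed version of the structure above; the walk function is a name made up here, and the sketch ignores the ‘\’ escaping of ‘/’ inside names for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple

SAMPLE = """<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<FileListing Version="1" Generator="DC++ 0.401">
  <Directory Name="share">
    <Directory Name="DC++ Prerelease">
      <File Name="DCPlusPlus.exe" Size="946176" TTH="xxxxxxxxx"/>
    </Directory>
    <File Name="ADC.doc" Size="154112" TTH="xxxxxxxxx"/>
  </Directory>
</FileListing>
"""


def walk(node: ET.Element, prefix: str = "/") -> Iterator[Tuple[str, int, str]]:
    """Yield (path, size, tth) for every File below `node`."""
    for child in node:
        name = child.get("Name", "")
        if name in (".", ".."):
            # Such a list must be ignored completely, so give up here.
            raise ValueError("file list contains '.' or '..'")
        if child.tag == "Directory":
            yield from walk(child, prefix + name + "/")
        elif child.tag == "File":
            yield prefix + name, int(child.get("Size", "0")), child.get("TTH", "")


# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(SAMPLE.encode("utf-8"))
files = list(walk(root))
```

Unknown elements and attributes are simply skipped, which matches the rule that extensions may add information other clients do not interpret.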

3         BASE implementation

Each message is specified as the action code and the message type contexts under which it is valid. This particular implementation is known as BASE, as far as protocol identification is concerned. All ADC clients/hubs should support this minimum of functionality, extending as necessary. The connecting party will from now on be known as client, the other as server. It is always the server that controls state transitions.

The message types are merely a pointer to where the commands are most likely to appear, but clients should be prepared that they might arrive in other ways (for example type D or C searches to search a particular client).

For client-client communication, this protocol is identified by the string “ADC/1.0”.

In the descriptions of the commands, the mandatory <from-CID> and trailing named parameters have been omitted.

3.1      Client – Hub communication

During login, the client goes through a number of stages. An action is valid only in the NORMAL stage unless otherwise noted. The stages, in login order, are PROTOCOL (feature support discovery), IDENTIFY (user identification, static checks), VERIFY (password check), NORMAL (normal operation). Any error in hub communication means disconnection, hopefully preceded by an STA action.

3.2      Client – Client communication

The client – client messages use essentially the same stages as client – hub, but probably without VERIFY (Client access passwords are not supported in BASE), and an additional DATA state.

3.3      Actions

3.3.1      STA

STA <code> <param1>…<paramN> <description>

Types: C, D, I

States: All


code

Status code in the form “xyy”, where x specifies the severity and yy the specific error code. The severity and error code are treated separately, i.e. the same error could occur at different severity levels.


Severity values:

0 Success (used for confirming commands)

1 Recoverable (error but no disconnect)

2 Fatal (disconnect)


Error codes:

00 Generic, show description

x0 Same as 00, but categorized according to the rough structure set below

10 Generic hub error

11 Hub full

12 Hub disabled

20 Generic login/access error

21 Nick invalid

22 Nick taken

23 Invalid username / password combination

24 CID taken

25 Access denied, param1 is the FOURCC. Sent when a user is not allowed to execute a particular command

26 Registered users only

30 Kicks/bans/disconnects generic

31 Permanently banned

32 Temporarily banned, param1 is an integer specifying the number of seconds left until it expires (this is used for kick as well…).

40 Protocol error

41 Transfer protocol unsupported, param1 the token, param2 the protocol string. The client receiving a CTM or RCM should send this if it doesn’t support the C-C protocol.

42 Required INF field missing/bad, param1 specifies the field.

43 Invalid state, param1 the FOURCC.

50 Client-client / file transfer error

51 File not available

52 File part not available

53 Slots full



description

Text description of the error, suitable for displaying directly to the user.


Even if an error code is unknown by the client, it should display the text message alone. Error codes are used so that the client can take different action on different errors. Most error codes don’t have parameters and only make sense in C and I types. Error responses should not be sent for obvious errors (a passive client sending a CTM for example).
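Splitting a status code into its two independent parts is simple; the Python sketch below shows one way. The function names and the SEVERITY table are made up for the illustration.

```python
from typing import Tuple

SEVERITY = {0: "success", 1: "recoverable", 2: "fatal"}


def split_status(code: str) -> Tuple[int, int]:
    """Split an "xyy" status code into (severity, error code)."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("status code must be three digits")
    return int(code[0]), int(code[1:])


def category(error: int) -> int:
    """Return the generic "x0" code that an error code falls under."""
    return error - error % 10
```

A client that does not know error 23 can still see from category(23) == 20 that it is a login/access error, and can always fall back to showing the description.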


3.3.2      SUP

SUP <+|-><feature1>...<+|-><featureN>

Types: C, H, I


This command identifies which features a specific client / hub supports. The feature name should use only upper case letters, and possibly a number to signal a revised feature. A central register of known features should be kept, to avoid clashes. All ADC clients should support the BASE feature (unless a future revision takes its place), which is this protocol. The resulting features used by two peers should be the intersection of features sent by the respective parties.

This command can also be used to dynamically add / remove features, ‘+’ meaning add and ‘-’ remove. For those commands that break or modify compatibility in some way (compression for example), the receiving end must verify with an equivalent SUP command, and the new feature set will be valid from that point. No other commands may be sent until the response has been received, to determine whether the other end actually supports the feature.

When the server receives this message the first time, it should reply with the same, send an INF about itself and move to the IDENTIFY state. The client, when it receives it the first time, should send an INF about itself.
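The bookkeeping this implies can be sketched in Python as follows. One assumption is labelled in the code: a feature name sent without a sign, as in the examples in section 4, is treated like ‘+’. The function names are made up for the illustration.

```python
from typing import Iterable, Set


def apply_sup(current: Set[str], tokens: Iterable[str]) -> Set[str]:
    """Return the feature set after applying '+NAME' / '-NAME' tokens."""
    features = set(current)
    for token in tokens:
        if token.startswith("-"):
            features.discard(token[1:])
        else:
            # Assumption: a bare name means the same as '+NAME'.
            features.add(token.lstrip("+"))
    return features


def negotiated(ours: Set[str], theirs: Set[str]) -> Set[str]:
    """Features usable on the connection: the intersection of both sides."""
    return ours & theirs
```

Each side keeps one set per peer, updates it with apply_sup whenever a SUP arrives, and only uses what negotiated returns.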


3.3.3      INF


Types: B, C, I


This command updates the information about a client. Each time this is received, it means that the fields specified have been added or updated. Each field is identified by two characters, directly followed by the data associated with that field. A field (and the effects of its presence) can be canceled by sending the field name without data. Clients should ignore any fields they don’t know, so that fields safely can be added in the future. Most of these fields are only interesting in the client-hub communication, during client-client this command is mainly used for identification purposes. Hubs can choose to require or ignore any or all of these fields; clients must work without any of them. Many of these fields, such as share size or client version, are purely informative heuristics, and should be taken with a grain of salt, as it is very easy to fake them. On the other hand, clients should strive to provide accurate data for the general health of the system, as providing invalid information probably will annoy a great deal of people. Updates are made in an incremental manner, by sending only the fields that have changed.



I4

IPv4 address without port. A zero address (0.0.0.0) means that the server should replace it with the real IP of the client.


I6

IPv6 address without port. A zero address ([::]) means that the server should replace it with the real IP of the client.


U4

Client UDP port. Sending this field to the hub with a port means that this client wants to run in active mode. If this field is missing (or empty if changing modes), it means that the client should be treated as passive.


U6

Same as U4, but for IPv6.


Share size in bytes, integer.


Number of shared files, integer


VE

Client identification, version (client specific, recommended a short identifier then a float for version number). It is important that hubs don’t discriminate clients based on their VE tag but instead rely on SUP when it comes to which clients should be allowed (for example, “we only want clients that can hash”). VE is there mainly for informative reasons, and can perhaps be used to warn users that they’re using a known buggy or vulnerable client.


Maximum upload speed, bits/sec, integer


SL

Upload slots open, integer


Automatic slot allocator speed limit, bytes/sec, integer. This is the recommended method of slot allocation, the client keeps opening slots as long as its total upload speed doesn’t exceed this value. SL then serves as a minimum number of slots open.


Maximum number of slots open in automatic slot manager mode, integer.


E-Mail, string.


NI

Nickname, string. Hub must ensure that this is unique (case insensitive) in each hub, to avoid confusion. Valid are all displayable characters (char code > 32) apart from space, although hubs are free to limit this further as they like with an appropriate error message.


Description, string. Valid are all displayable characters (char code >= 32).


Hubs where user is a normal user, integer.


Hubs where user is registered (had to supply password), integer.


Hubs where user is op in, integer.


TO

Token (used with CTM) in the c-c connection.





2=Extended away, don’t care about main chat either (hubs can skip sending MSG commands if they want)

(Other away modes are reserved for the future)




1=Hidden, should not be shown on the user list.


1=Hub, this INF is about the hub itself


Hubs are welcome to mandate or discard any and all fields, but obviously the more the merrier (and clients could be disconnected for not sending some of them…).

Note: normally a hub should only accept an IP (I4 or I6) that is the same as the source IP of the connecting peer. Only for trusted users may a different IP, or an IP from a different domain (IPv4 or IPv6), be allowed. Use caution when accepting unknown IPs; if you fail to do this, your hub can be used as a medium for DDoS attacks.

When a server receives this in the IDENTIFY state, it should respond with an INF about itself and proceed either to the VERIFY state by sending a PAS request, or to the NORMAL state by starting to send the INF of all clients, where the INF of the connecting client must come last. When the hub sends an INF about itself, NI becomes the hub name, VE the hub version, etc.
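Because INF updates are incremental, a receiver has to merge each one into what it already knows about the sender. A minimal Python sketch of that merge, with the function name and the sample values made up for the illustration:

```python
from typing import Dict, Iterable


def apply_inf(known: Dict[str, str], fields: Iterable[str]) -> Dict[str, str]:
    """Merge incremental INF fields into the stored record for one CID."""
    updated = dict(known)
    for item in fields:
        code, value = item[:2], item[2:]
        if value:
            updated[code] = value      # field added or changed
        else:
            updated.pop(code, None)    # empty value cancels the field
    return updated
```

For example, after apply_inf({}, ["NIalice", "VEdemo0.1", "SL3"]) a later update of ["SL5", "VE"] changes the slot count and removes the version while leaving the nickname alone. Fields with unknown codes are stored like any other, so they can safely be ignored elsewhere.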

3.3.4      MSG

MSG <text>

Types: A, B, D, I, P

A chat message. The receiving clients should precede it with ‘<’ nick ‘>’, to allow for uniform displaying of messages. The client should not send its own nick in the text.



Private message, <group-CID> is the reply-to CID, and should be shown as header for the chat (window title, etc). This is used to implement group discussions such as op-chat. Must contain the originating CID if this is a normal private conversation.


3.3.5      SCH


Types: P, U, D, (B), (A)

Search. Each parameter is an operator followed by a term. Each term is a two-letter code followed by the data to search for. Clients that don’t recognize a field should ignore the search.

++, --, EX

String search term, where ++ is include, -- is exclude, and EX is extension. Each filename (including the path to it) should be matched using case insensitive substring search as follows: match all ++, remove those that match any --, and make sure the extension matches at least one of the EX (if it is present). Extensions must be sent without the leading ‘.’.


Smaller than or equal size in bytes


Larger than or equal size in bytes


Exact size in bytes


Token, string. Used by the client to tell one search from the other. If present, the responding client must copy this field exactly to each search result.


Tiger tree hash root, encoded with base32.


File type, to be chosen from the following:

1 = File (default, doesn’t need specifying)

2 = Directory


Note that hubs normally only relay searches to passive clients (type P) and clients send searches to active clients by themselves using type U, which should prove a massive bandwidth saver for the hubs. Should ISPs dislike this, a switch to type B searching is easily done.
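The string matching rules for ++, -- and EX can be sketched in Python as below. The function name is made up, and the way the extension is extracted (whatever follows the last ‘.’ in the final path component) is an assumption, since the draft does not spell it out. Size, type and hash terms are left out.

```python
from typing import Iterable


def matches(path: str, include: Iterable[str], exclude: Iterable[str],
            extensions: Iterable[str]) -> bool:
    """Apply the ++ / -- / EX rules to one full path from the share."""
    lowered = path.lower()
    # Every ++ term must occur somewhere in the path.
    if not all(term.lower() in lowered for term in include):
        return False
    # Any -- term occurring in the path removes the file.
    if any(term.lower() in lowered for term in exclude):
        return False
    exts = [e.lower() for e in extensions]
    if exts:
        # Assumption: the extension is the text after the last '.' in the
        # final path component.
        name = lowered.rsplit("/", 1)[-1]
        ext = name.rsplit(".", 1)[-1] if "." in name else ""
        return ext in exts
    return True
```

With the path "/share/DC++ Prerelease/DCPlusPlus.exe", a search for ++dcplus matches, while adding --prerelease or EXdoc rejects it.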

3.3.6      RES


Types: D, U

Search result, made up of fields similar to the INF ones. It is of course better for the network if the client sends all it knows about a file, unless it’s a lot of data. Search results without size and filename are obviously useless, but if a client has hashing or any other meta-data to add, that’s only good. Passive results should be limited to 5, active to 10. To return a directory as result, make sure the name ends with a path separator.


Full filename including path


Size, in bytes


Slots currently available




Tiger tree hash root, encoded with base32.


Tiger tree depth, index of the highest level of tree data available, only root = 0, first level (2 leaves) = 1, second level = 2, etc…(this is useful when we want to verify a file and search for the most detailed tree)


3.3.7      CTM

CTM <proto> <port> TO<token>

Types: D

Connect to me. Used by active clients that want to connect to someone, or in answer to RCM. Only TCP active clients may send this. TO is a string that can be used to identify the connection once a direct connection has been made, but is not mandatory. If present, it must be passed with the initial INF during client-client connect. <proto> is an arbitrary string specifying the protocol to connect with; in the case of an ADC compliant connection attempt, this should be the string “ADC/1.0”. If this is a response to an RCM, the <token> and <proto> fields should just be copied directly (if the protocol is supported, of course). If a protocol is not supported, a DSTA must be sent indicating this.

3.3.8      RCM

RCM <proto> TO<token>

Types: D

Reverse CTM. Used by passive clients to signal that they want a connection token from an active client.

3.3.9      GPA

GPA <data>

Types: I

States: VERIFY

Get Password. The parameter is 192 random binary bits (base32 encoded), used to avoid replay attacks on the password.

3.3.10 PAS

PAS <password>

Types: H

States: VERIFY

Password. The CID (in binary), then the password, followed by the random data, passed through the Tiger hash algorithm (not Tiger Tree) then base32. When validated, this moves the server into NORMAL state.
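The GPA / PAS exchange could be sketched in Python as follows. Several details are assumptions that the draft does not fix: the CID is serialised as 8 big-endian bytes, the password is encoded as UTF-8, and base32 padding is stripped. The hash is passed in as a parameter because Python's standard library does not guarantee Tiger; stand_in_hash uses SHA-1 only so that the sketch runs, and is not what a hub would accept.

```python
import base64
import hashlib
import secrets
from typing import Callable


def b32(data: bytes) -> str:
    """Base32 text; stripping '=' padding is an assumption."""
    return base64.b32encode(data).decode("ascii").rstrip("=")


def make_challenge() -> bytes:
    """192 random bits for the hub to send, base32 encoded, in GPA."""
    return secrets.token_bytes(24)


def pas_response(cid: int, password: str, challenge: bytes,
                 hash_fn: Callable[[bytes], bytes]) -> str:
    """Hash CID + password + challenge and base32 encode the digest."""
    # Assumptions: big-endian 64-bit CID and a UTF-8 encoded password.
    blob = cid.to_bytes(8, "big") + password.encode("utf-8") + challenge
    return b32(hash_fn(blob))


def stand_in_hash(data: bytes) -> bytes:
    """NOT Tiger: a SHA-1 placeholder so the sketch runs with the stdlib."""
    return hashlib.sha1(data).digest()
```

The hub computes the same value from its stored password and compares. Since the challenge is fresh for every login, a captured PAS cannot be replayed later.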

3.3.11 QUI

QUI <CID> <reason> <param1>…<paramN>

Types: I


The client identified by CID disconnected from the hub. If the CID is the same as that of the client receiving the QUI, it means that the client should take action according to the reason (i.e. redirect).

Reason can be one of the following:


Normal disconnect


Disconnected (friendly disconnect), param1 = originating CID, param2=message


Kicked (unfriendly disconnect), param1 = originating CID, param2=message


Banned, param1 = originating CID, param2 = seconds banned, -1 = forever, param3=message


Redirected, param1 = originating CID, param2 = redirect address, param3=message

Message is optional.


3.3.12 DSC

DSC <victim-CID> <reason>

Types: H

This is the friendly disconnect command. Kicks / Bans etc can be implemented as User Commands to allow more flexibility for those that want it.

3.3.13 GET

GET <type> <identifier> <start-pos> <bytes>

Types: C, H, I

Requests that a certain file or binary data be transmitted. <start-pos> counts 0 as the first byte. <bytes> may be set to -1 to indicate that it is unknown. <type> is a [a-zA-Z0-9] string that specifies the namespace for <identifier>. BASE requires that clients recognize the types “file” and “tthl”, where the identifier is a filename in the share (either a file name in the anonymous root, or a TTH root value in the “TTHR” root).

“file” transfers transfer the file data in binary, starting at <start-pos> and sending <bytes> bytes.

“tthl” transfers send the highest level of leaves available (the one containing the most leaves) as a binary stream of leaves, right-to-left, with no spacing in between them. <start-pos> must be set to 0 and <bytes> to -1 when requesting the data. <bytes> must contain the total binary size of the leaf stream in SND; by dividing this length by the individual hash length, the number of leaves, and thus the leaf level, can be deduced. The received leaves can then be used to reconstruct the entire tree, and the resulting root must match the root of the file (this verifies the integrity of the tree itself).
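The arithmetic for deducing the leaf count and level is shown in the Python sketch below. It relies on a Tiger digest being 192 bits (24 bytes) and on the level numbering used for tree depth in RES (root only = 0, two leaves = 1, and so on); the function name is made up for the illustration.

```python
from typing import Tuple

TIGER_BYTES = 24  # a Tiger digest is 192 bits


def leaf_info(stream_bytes: int) -> Tuple[int, int]:
    """Return (leaf count, tree level) for a tthl stream of the given size."""
    if stream_bytes <= 0 or stream_bytes % TIGER_BYTES:
        raise ValueError("leaf stream must be a whole number of hashes")
    leaves = stream_bytes // TIGER_BYTES
    # Level n holds at most 2**n nodes, with the root alone at level 0,
    # so the level is the smallest n with 2**n >= leaves.
    level = (leaves - 1).bit_length()
    return leaves, level
```

A 120-byte stream, for instance, holds 5 leaves, which can only come from level 3 of the tree.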

Passive clients depend on the “no slots” error being recoverable. If a client gets a recoverable error after a GET command and has nothing else to do, it must send NTD; otherwise the passive client will never get a chance at downloading if the other client has a file queued. Note that GET can also be used for binary transfers between hub and client.

3.3.14 GFI

GFI <type> <identifier>

Types: C

Get File Information, request that the other client returns a RES with relevant file data, for example size. Type and identifier are the same as for GET.

3.3.15 SND

SND <type> <identifier> <start-pos> <bytes>

Types: C, H, I

State transition to DATA state. The sender will keep on sending until <bytes> bytes of binary data have been sent, and then will put itself back to NORMAL state. The parameters are essentially a mirror of the GET parameters, but bytes must be replaced if it was -1 in the GET request.

3.3.16 NTD


Types: C

Nothing to do. This is sent by the server to indicate that it passes control of the NORMAL state over to the other client, effectively making it the server. It is always the server that has the first say in who will transfer files; this way we don’t have to remember if we’re connecting because of a CTM or because we want to download. A client that receives NTD and has nothing to do itself should disconnect.

4         Examples

4.1      Client – Hub connection



Client: HSUP <Client-CID> BASE <other-features>

Hub: ISUP <Hub-CID> BASE <other-features>

Hub: IINF <Hub-CID> …

Client: BINF <Client-CID> LO1…

Hub: IGPA <Hub-CID> …

Client: HPAS <Client-CID> …

Hub: BINF <all clients>

Hub: BINF <Client-CID> …

4.2      Client – Client connection



Client: CSUP <CID> BASE <other-features>

Server: CSUP <CID> BASE <other-features>

Client: CINF <CID> TO<token>

5         Standard Extensions

5.1      REGEX

Regular expressions in searches. Extends the SCH command with the operator RE that takes a regular expression in the (Perl? PCRE? Java? .NET? POSIX?) form.

5.2      ZLIB

ZLib compressed communication. There are two variants of zlib support, FULL and GET, and only one should be used on each communication channel that is set up.

5.2.1      ZLIB-FULL

If, during initial SUP negotiation, both ends send “ZLIF” in their support string, it means that all subsequent message passing will be tunneled in one long zlib stream. Care must be taken to partially flush the zlib buffer when needed to ensure that the commands are in a decompressable state when they arrive at the other end.
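The flushing requirement can be illustrated with Python's standard zlib module. This sketch uses Z_SYNC_FLUSH, which is the flush mode I am sure the module exposes and which likewise makes all pending output decodable; it is a substitute for the partial flush mentioned above, not necessarily the same mode. The send and receive names, the sample messages and the use of UTF-8 for message text are assumptions made for the illustration.

```python
import zlib

# One compressor and one decompressor live for the whole connection,
# so all messages share a single zlib stream.
compressor = zlib.compressobj()
decompressor = zlib.decompressobj()


def send(message: str) -> bytes:
    """Compress one message and flush so it can be decoded immediately."""
    data = compressor.compress(message.encode("utf-8"))
    return data + compressor.flush(zlib.Z_SYNC_FLUSH)


def receive(chunk: bytes) -> str:
    """Feed received bytes to the single long-lived inflate stream."""
    return decompressor.decompress(chunk).decode("utf-8")
```

Without the flush after each message, the compressor may hold data back in its buffer and the other end would sit waiting for a command that has already been "sent".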

5.2.2      ZLIB-GET

The alternative is to send “ZLIG” to indicate that ZLib is supported for binary transfers using the GET command, but not otherwise (memory constraints in the hub for example). A flag “ZL1” is added to the SND command to indicate that the data will come compressed, and the receiving client requests this by adding the same flag to GET. The <bytes> parameter of the GET and SND commands is to be interpreted as the number of uncompressed bytes to be transferred.

5.3      UCMD


User commands.