Aug, 2020

MPTCP Integer Overflow Vulnerability

In this post, we describe an integer overflow vulnerability in the MPTCP module of the XNU kernel.

When we started to study MPTCP, we found a very brief description in the official documentation:

“MPTCP is a set of extensions to the Transmission Control Protocol (TCP) specification. With MPTCP, a client can connect to the same destination host with multiple connections over different network adapters”.

A natural question came to mind: what is the maximum number of connections a client can open to a host? With this question in mind, we wrote a simple test program that creates an MPTCP socket and connects to a host many times. Our goal was to find the point at which no new connections could be created.

The test program ran fine. To our surprise, however, it triggered a kernel panic when it exited.

The test program was so simple that we had no clue about what triggered the panic. After analyzing the panic log, we realized that our program had triggered a recursive kernel function, which exhausted the kernel stack when the MPTCP socket was closed. Note that this recursion panic has already been fixed; you won't be able to trigger it on iOS 13.

We continued testing and turned to the XNU source code, where we quickly found the following data structure.

struct mptses {
	struct mppcb    *mpte_mppcb;            /* back ptr to multipath PCB */
	struct mptcb    *mpte_mptcb;            /* ptr to MPTCP PCB */
	TAILQ_HEAD(, mptopt) mpte_sopts;        /* list of socket options */
	TAILQ_HEAD(, mptsub) mpte_subflows;     /* list of subflows */
	uint16_t        mpte_numflows;          /* # of subflows in list */
	uint16_t        mpte_nummpcapflows;     /* # of MP_CAP subflows */
	sae_associd_t   mpte_associd;           /* MPTCP association ID */
	sae_connid_t    mpte_connid_last;       /* last used connection ID */
...

mptses represents an MPTCP session. Every time a new connection is created between a client and a host, a new subflow (struct mptsub) is added to the mpte_subflows list. mptses->mpte_numflows records the number of subflows.

static void
mptcp_subflow_attach(struct mptses *mpte, struct mptsub *mpts, struct socket *so)
{
	struct socket *mp_so = mpte->mpte_mppcb->mpp_socket;
	struct tcpcb *tp = sototcpcb(so);

...
	/*
	 * Insert the subflow into the list, and associate the MPTCP PCB
	 * as well as the the subflow socket.  From this point on, removing
	 * the subflow needs to be done via mptcp_subflow_del().
	 */
	TAILQ_INSERT_TAIL(&mpte->mpte_subflows, mpts, mpts_entry);
	mpte->mpte_numflows++; //<====== no integer overflow checks

As we can see in mptcp_subflow_attach, creating a new connection increases mpte->mpte_numflows by one, but there is no integer overflow check at all.

Note also that mpte_numflows is of type uint16_t, so its maximum value is 0xFFFF. So what happens if we create 0xFFFF+2 connections? The answer is that mpte_numflows wraps around to 1!
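The wrap-around can be reproduced in user space. The following is a minimal sketch, not kernel code: counter_after is a hypothetical helper that increments a uint16_t counter the same way mpte_numflows++ does.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of XNU): returns the value a uint16_t
 * counter holds after being incremented `attaches` times from zero.
 */
static uint16_t
counter_after(uint32_t attaches)
{
	uint16_t numflows = 0;

	for (uint32_t i = 0; i < attaches; i++) {
		numflows++;     /* wraps modulo 65536, no check */
	}
	return numflows;
}
```

Calling counter_after(0xFFFF + 2) returns 1, even though 65537 increments took place.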

So far, the integer overflow doesn't cause any memory errors, so we went on to check how mpte_numflows is used. Simply grepping for mpte_numflows led us to the following sysctl handler, mptcp_pcblist:

static int
mptcp_pcblist SYSCTL_HANDLER_ARGS
{
...
	TAILQ_FOREACH(mpp, &mtcbinfo.mppi_pcbs, mpp_entry) {
		flows = NULL;
		socket_lock(mpp->mpp_socket, 1);
		VERIFY(mpp->mpp_flags & MPP_ATTACHED);
		mpte = mptompte(mpp);

...
		mptcpci.mptcpci_nflows = mpte->mpte_numflows;
...
		len = sizeof(*flows) * mpte->mpte_numflows;
		if (mpte->mpte_numflows != 0) {
			flows = _MALLOC(len, M_TEMP, M_WAITOK | M_ZERO); // <=== allocates memory according to mpte->mpte_numflows
...
		f = 0;
		TAILQ_FOREACH(mpts, &mpte->mpte_subflows, mpts_entry) {
			so = mpts->mpts_socket;
			fill_mptcp_subflow(so, &flows[f], mpts); // <== dumps the list into the flows buffer. HEAP OVERFLOW!
			f++;
		

In mptcp_pcblist, mpte_numflows is used to calculate the length of a temporary buffer. If we have already made mpte_numflows wrap to 1, the allocation site allocates only ONE entry. However, mptcp_pcblist then traverses the whole mpte_subflows list and dumps every entry into the allocated buffer. A heap overflow happens!
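The mismatch between the allocation size and the list traversal can be sketched in user space. This is an illustration under our own assumptions, not Apple's code or fix: struct subflow, struct flow_info, and dump_flows are hypothetical stand-ins, and the loop carries the bound that the vulnerable traversal lacks.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the subflow list node and the dumped record. */
struct subflow   { struct subflow *next; uint32_t id; };
struct flow_info { uint32_t id; };

/*
 * Allocates room for `count` records and copies at most `count` list
 * entries into it. Returns the number of records written; *out receives
 * the buffer (NULL if nothing was allocated).
 */
static size_t
dump_flows(const struct subflow *head, uint16_t count, struct flow_info **out)
{
	size_t f = 0;

	*out = NULL;
	if (count == 0) {
		return 0;
	}
	struct flow_info *flows = calloc(count, sizeof(*flows));
	if (flows == NULL) {
		return 0;
	}
	for (const struct subflow *s = head; s != NULL; s = s->next) {
		if (f >= count) {
			break;  /* list is longer than the counter claims */
		}
		flows[f].id = s->id;
		f++;
	}
	*out = flows;
	return f;
}
```

With a three-entry list and a counter that has wrapped to 1, the sketch writes one record and stops, whereas an unbounded loop would write three records into a one-record buffer.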

We won't go into the exploitation phase here. With partially controlled values and a partially controlled length, exploiting this bug would be very interesting as well.

Fixing the issue is straightforward: in the patched code, the mptcp_subflow_add function enforces an upper limit on mpte_numflows.

Do you still remember the question from the beginning? How many connections does an MPTCP socket allow? Now we have the answer:

#define MPTCP_MAX_NUM_SUBFLOWS 256
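To illustrate what such a limit achieves, here is a minimal sketch of a bounded counter. It is not the actual patch: try_attach is a hypothetical helper, and both the exact comparison and the EOVERFLOW return value are our assumptions.

```c
#include <errno.h>
#include <stdint.h>

#define MPTCP_MAX_NUM_SUBFLOWS 256

/*
 * Hypothetical helper (not XNU code): refuses to count another subflow
 * once the limit is reached, so the uint16_t counter can never wrap.
 */
static int
try_attach(uint16_t *numflows)
{
	if (*numflows >= MPTCP_MAX_NUM_SUBFLOWS) {
		return EOVERFLOW;       /* error code chosen for illustration */
	}
	(*numflows)++;
	return 0;
}
```

With the limit in place, the counter stays far below 0xFFFF no matter how many connections a program requests.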

Credit: The integer overflow was discovered and analyzed by Tao Huang and Tielei Wang of Pangu Lab.

Thanks for reading!

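The Pain of sockaddr->sa_len

0x00 Introduction

sockaddr is an ordinary data structure in the XNU kernel. It describes the basic properties of a socket address: the length of the address and the address family it belongs to. The structure is defined as follows: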

struct sockaddr {
	__uint8_t       sa_len;         /* total length */   
	sa_family_t     sa_family;      /* [XSI] address family */
	char            sa_data[14];    /* [XSI] addr value (actually larger) */
};

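Because XNU supports many socket types, and different socket types may use sockaddr structures of different lengths, XNU defines a concrete structure for each kind of sockaddr. For example, the structures of sockaddr_in, sockaddr_in6, sockaddr_un, and sockaddr_ctl are shown below.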

struct sockaddr_in {
	__uint8_t       sin_len;
	sa_family_t     sin_family;
	in_port_t       sin_port;
	struct  in_addr sin_addr;
	char            sin_zero[8];
};
struct sockaddr_in6 {
	__uint8_t       sin6_len;       /* length of this struct(sa_family_t) */
	sa_family_t     sin6_family;    /* AF_INET6 (sa_family_t) */
	in_port_t       sin6_port;      /* Transport layer port # (in_port_t) */
	__uint32_t      sin6_flowinfo;  /* IP6 flow information */
	struct in6_addr sin6_addr;      /* IP6 address */
	__uint32_t      sin6_scope_id;  /* scope zone index */
};
struct  sockaddr_un {
	unsigned char   sun_len;        /* sockaddr len including null */
	sa_family_t     sun_family;     /* [XSI] AF_UNIX */
	char            sun_path[104];  /* [XSI] path name (gag) */
};
struct sockaddr_ctl {
	u_char      sc_len;     /* depends on size of bundle ID string */
	u_char      sc_family;  /* AF_SYSTEM */
	u_int16_t   ss_sysaddr; /* AF_SYS_KERNCONTROL */
	u_int32_t   sc_id;      /* Controller unique identifier  */
	u_int32_t   sc_unit;    /* Developer private unit number */
	u_int32_t   sc_reserved[5];
};

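Every sockaddr_* structure begins with the same header as sockaddr: the first byte, sa_len, gives the length of the structure, and the second byte, sa_family, gives the address type. When the kernel works with a struct sockaddr * pointer, it must cast it to a concrete type such as struct sockaddr_in6 * or struct sockaddr_in * according to sa_family. Consequently, when the kernel processes sockaddr data submitted from user space, insufficiently strict checks on sa_family or sa_len can lead to security vulnerabilities. sa_len is especially sensitive: it describes the length of the data, so a flawed check can cause out-of-bounds memory access.

0x01 Vulnerabilities

In recent years, a number of sockaddr-related vulnerabilities in XNU have been disclosed. The best known is the one Ian Beer of Google Project Zero found in the mptcp module. We start with a detailed look at that bug, because understanding its root cause is very helpful for finding new ones.

Reviewing the MPTCP vulnerability

The mptcp vulnerability found by Ian Beer is in the mptcp_usr_connectx function. When handling sockaddr data passed in from user space, mptcp_usr_connectx assumes the type can only be AF_INET or AF_INET6, and it strictly checks the sa_len field for those two types. The logic flaw is that when the sockaddr is neither AF_INET nor AF_INET6, sa_len is not checked at all. mptcp_usr_connectx then passes sa_len to memcpy, causing a heap overflow. For a more detailed analysis, see Issue 1558: XNU kernel heap overflow due to bad bounds checking in MPTCP.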

// verify sa_len for AF_INET:

  if (dst->sa_family == AF_INET &&
      dst->sa_len != sizeof(mpte->__mpte_dst_v4)) {
    mptcplog((LOG_ERR, "%s IPv4 dst len %u\n", __func__,
        dst->sa_len),
       MPTCP_SOCKET_DBG, MPTCP_LOGLVL_ERR);
    error = EINVAL;
    goto out;
  }

// verify sa_len for AF_INET6:

  if (dst->sa_family == AF_INET6 &&
      dst->sa_len != sizeof(mpte->__mpte_dst_v6)) {
    mptcplog((LOG_ERR, "%s IPv6 dst len %u\n", __func__,
        dst->sa_len),
       MPTCP_SOCKET_DBG, MPTCP_LOGLVL_ERR);
    error = EINVAL;
    goto out;
  }

// code doesn't bail if sa_family was neither AF_INET nor AF_INET6

  if (!(mpte->mpte_flags & MPTE_SVCTYPE_CHECKED)) {
    if (mptcp_entitlement_check(mp_so) < 0) {
      error = EPERM;
      goto out;
    }

    mpte->mpte_flags |= MPTE_SVCTYPE_CHECKED;
  }

// memcpy with sa_len up to 255:

  if ((mp_so->so_state & (SS_ISCONNECTED|SS_ISCONNECTING)) == 0) {
    memcpy(&mpte->mpte_dst, dst, dst->sa_len); // <== sa_len is never validated when sa_family is neither AF_INET nor AF_INET6, so it can be as large as 0xff, causing a heap overflow
  }

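Ian Beer's exploitation technique for this bug is also brilliant. We set the exploitation aside here and look again at the pattern behind the bug. The developer did deliberately check the sockaddr data, but only checked that specific types match their expected lengths; as a result, if the incoming sockaddr is of any other type, its sa_len field is never effectively validated.

Vulnerability 1 ==> inctl_ifdstaddr

After studying Ian Beer's bug, we began to wonder whether similar problems exist elsewhere in XNU: code that checks type and length only for some sockaddr types, and then goes on to use sockaddrs of other types unchecked.

With this question in mind, we continued auditing the XNU code. We soon found a new information leak in the ioctl handler, the in_control function.

The root cause is that inctl_ifdstaddr, when handling the SIOCSIFDSTADDR command, only fixes up sin_len when the family is AF_INET. When the family is anything else (AF_INET6, for example), sin_len is not checked and can hold an arbitrary value.

As shown below, ifr points to user-controlled data. When handling SIOCSIFDSTADDR, inctl_ifdstaddr first copies the user-controlled ifr_dstaddr into ia->ia_dstaddr, and then, at a, handles the AF_INET case by setting ia->ia_dstaddr.sin_len to the size of struct sockaddr_in.

However, when the family has any other value, such as AF_INET6, inctl_ifdstaddr does nothing, so ia->ia_dstaddr.sin_len keeps the user-controlled length copied from ifr, anywhere in the range 0 to 0xff.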
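The pattern can be condensed into a small user-space sketch. Everything here is hypothetical and only mirrors the logic quoted above: the X_ constants stand in for the macOS values of AF_INET and AF_INET6 and for the sizes of sockaddr_in and sockaddr_in6, while flawed_check and strict_check are names we made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants: address family values and structure sizes as on macOS. */
enum { X_AF_INET = 2, X_AF_INET6 = 30 };
enum { X_SIN_LEN = 16, X_SIN6_LEN = 28 };

/* Mirrors the flawed logic: only length mismatches for known families are rejected. */
static bool
flawed_check(uint8_t family, uint8_t len)
{
	if (family == X_AF_INET && len != X_SIN_LEN) {
		return false;
	}
	if (family == X_AF_INET6 && len != X_SIN6_LEN) {
		return false;
	}
	return true;    /* any other family falls through with len unchecked */
}

/* Allowlist variant: unknown families are rejected outright. */
static bool
strict_check(uint8_t family, uint8_t len)
{
	switch (family) {
	case X_AF_INET:
		return len == X_SIN_LEN;
	case X_AF_INET6:
		return len == X_SIN6_LEN;
	default:
		return false;
	}
}
```

For a family of 1 and a length of 255, flawed_check returns true while strict_check returns false; for well-formed AF_INET and AF_INET6 inputs the two agree.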

static __attribute__((noinline)) int
inctl_ifdstaddr(struct ifnet *ifp, struct in_ifaddr *ia, u_long cmd,
    struct ifreq *ifr){
  //...
	case SIOCSIFDSTADDR:            /* struct ifreq */
		VERIFY(ia != NULL);
		IFA_LOCK(&ia->ia_ifa);
		dstaddr = ia->ia_dstaddr;
		bcopy(&ifr->ifr_dstaddr, &ia->ia_dstaddr, sizeof(dstaddr));
		if (ia->ia_dstaddr.sin_family == AF_INET) {
			ia->ia_dstaddr.sin_len = sizeof(struct sockaddr_in); // <== a: sin_len is only fixed up when the family is AF_INET
		}
		//...
}

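At this point, we can place a non-AF_INET sockaddr in ia->ia_dstaddr with an arbitrary sin_len. The next question is: where is ia->ia_dstaddr used?

Continuing the audit, we found a use of ia->ia_dstaddr in the sysctl_iflist function. In the code below, ifa->ifa_dstaddr is the ia->ia_dstaddr set in inctl_ifdstaddr. At b, this sockaddr is stored in rti_info, which is then passed to rt_msg2.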

static int
sysctl_iflist(int af, struct walkarg *w)
{
  while ((ifa = ifa->ifa_link.tqe_next) != NULL) {
        //...
        info.rti_info[RTAX_IFA] = ifa->ifa_addr;
	info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask;
	info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr; // <== b: the sockaddr set earlier
//...
	len = rt_msg2(RTM_NEWADDR, &info, // <== c: info is passed to rt_msg2
		(caddr_t)cp, NULL, &cred);
          //...
    }

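Now let us look at the implementation of rt_msg2. It loops over the rtinfo array; when it reaches RTAX_BRD, sa is ifa->ifa_dstaddr. As shown at e, dlen is then the user-controlled length from earlier, which can be as large as 0xff. When rt_msg2 calls bcopy to copy the memory, an out-of-bounds read occurs, copying out up to 255 bytes. The leaked data may contain function pointers, resulting in a kernel information leak.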

static int
rt_msg2(int type, struct rt_addrinfo *rtinfo, caddr_t cp, struct walkarg *w,
    kauth_cred_t* credp){
  for (i = 0; i < RTAX_MAX; i++) {
    //...
  	if ((sa = rtinfo->rti_info[i]) == NULL) { // <== d: when i reaches RTAX_BRD, sa is ifa->ifa_dstaddr
			continue;
		}
    //...
    rtinfo->rti_addrs |= (1 << i);
		dlen = sa->sa_len; 	// <== e: when i reaches RTAX_BRD, dlen is user controlled
		rlen = ROUNDUP32(dlen);
		if (cp) {
			bcopy((caddr_t)sa, cp, (size_t)dlen); // <== f: cp is eventually copied out to user space
			if (dlen != rlen) {
				bzero(cp + dlen, rlen - dlen);
			}
			cp += rlen;
		}
		len += rlen;
  }
  //...
}
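The size of the over-read follows from simple arithmetic, sketched below. Both helpers are hypothetical: roundup4 assumes that ROUNDUP32 rounds a nonzero length up to a multiple of four, and overread_bytes assumes the destination address occupies a 16-byte sockaddr_in-sized field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds n up to a multiple of 4 (our assumption for ROUNDUP32). */
static size_t
roundup4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Bytes read past the end of an `objsize`-byte object when sa_len bytes are copied. */
static size_t
overread_bytes(uint8_t sa_len, size_t objsize)
{
	return sa_len > objsize ? (size_t)sa_len - objsize : 0;
}
```

With sa_len set to 0xff and a 16-byte field, 239 bytes beyond the field are copied, and the entry takes 256 bytes in the output buffer after rounding.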

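Running our PoC confirms the leak. Once a function pointer has been read out of bounds, the kernel slide can be computed.

Vulnerability 2 ==> flow_divert_is_sockaddr_valid

The information leak above is not an isolated case. Clearly, the developers made the same mistake as in mptcp. We then broadened the pattern a little and looked at how other XNU modules check the sa_len field.

Soon, in the flow_divert_is_sockaddr_valid function, we saw the following code.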
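The slide computation itself is a single subtraction. The sketch below assumes the leaked value has been identified as a pointer to a known kernel function whose unslid address was taken from the kernelcache; kernel_slide is a hypothetical helper, and no real addresses are implied.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the slide is the difference between a function's
 * runtime (leaked) address and its static, unslid address.
 */
static uint64_t
kernel_slide(uint64_t leaked_ptr, uint64_t unslid_addr)
{
	return leaked_ptr - unslid_addr;
}
```

Adding the resulting slide to any other unslid kernel address gives that symbol's runtime address.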

static boolean_t
flow_divert_is_sockaddr_valid(struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET:
		if (addr->sa_len < sizeof(struct sockaddr_in)) { // <== should be !=
			return FALSE;
		}
		break;
#if INET6
	case AF_INET6:
		if (addr->sa_len < sizeof(struct sockaddr_in6)) { // <== should be !=
			return FALSE;
		}
		break;
#endif  /* INET6 */
	default:
		return FALSE;
	}
	return TRUE;
}

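From its name, it is easy to guess that flow_divert_is_sockaddr_valid is meant to verify that a sockaddr is valid. It explicitly restricts the sockaddr to AF_INET or AF_INET6. In the length check, however, it makes an elementary mistake: it only checks that addr->sa_len is not smaller than the actual size of the structure, and does not consider that sa_len may be larger than that size.

Therefore, as long as the sockaddr passed in is AF_INET or AF_INET6, an attacker can set an overly long sa_len, causing an out-of-bounds memory access when flow_divert later uses the sockaddr. Interested readers can try building a PoC themselves.

0x02 Fixes

Apple fixed the first leak in iOS 13.6, the latest release at the time of writing. The patch can be found by diffing the open-source xnu-6153.141.1.

In the patched code, inctl_ifdstaddr, when handling the SIOCSIFDSTADDR command, forcibly sets the family and sin_len fields of ia->ia_dstaddr to the AF_INET values.

Apple fixed the second vulnerability in iOS 13.5. Rather than changing flow_divert_is_sockaddr_valid itself, Apple added a length check in the code that calls it.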
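The difference between a lower-bound check and an exact-match check fits in a few lines. This is a user-space sketch with hypothetical names; X_SIN_LEN stands in for sizeof(struct sockaddr_in), which is 16 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

enum { X_SIN_LEN = 16 };        /* stand-in for sizeof(struct sockaddr_in) */

/* Mirrors the flawed check: only lengths that are too short are rejected. */
static bool
lower_bound_only(uint8_t sa_len)
{
	return !(sa_len < X_SIN_LEN);
}

/* Exact-match check: lengths that are too short or too long are rejected. */
static bool
exact_match(uint8_t sa_len)
{
	return sa_len == X_SIN_LEN;
}
```

A length of 255 passes lower_bound_only but fails exact_match, which is precisely the gap an attacker needs.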
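The effect of forcing both fields can be sketched as follows. This is not the XNU patch itself: struct x_sockaddr_in is a simplified stand-in for sockaddr_in, X_AF_INET is the macOS value of AF_INET, and normalize_dstaddr is a hypothetical helper.

```c
#include <stdint.h>

enum { X_AF_INET = 2 };         /* AF_INET value, for illustration */

/* Simplified stand-in for struct sockaddr_in (16 bytes). */
struct x_sockaddr_in {
	uint8_t  sin_len;
	uint8_t  sin_family;
	uint16_t sin_port;
	uint32_t sin_addr;
	char     sin_zero[8];
};

/*
 * Hypothetical helper: overwrites whatever family and length the user
 * supplied, so later consumers never see an attacker-chosen sin_len.
 */
static void
normalize_dstaddr(struct x_sockaddr_in *dst)
{
	dst->sin_family = X_AF_INET;
	dst->sin_len = (uint8_t)sizeof(*dst);
}
```

After normalization, a destination address submitted with a family of AF_INET6 and a length of 0xff carries a length of 16, so a later copy driven by the length field stays within the structure.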

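0x03 Summary

In this article, we shared how, after Ian Beer published the mptcp vulnerability, we analyzed its root cause, distilled its pattern, and used that pattern to find new vulnerabilities. Vulnerability research tests a researcher's ability to generalize from a single case. Quickly and precisely locating suspicious code in a large code base greatly improves the efficiency of bug hunting, and studying historical vulnerabilities is a great help in locating such code. In addition, sockaddr is a very simple data structure, yet across the many type conversions it goes through, incomplete type and length checks can lead to serious security problems. Beyond the two vulnerabilities we shared here, we believe other similar issues can be found.

Credit: The vulnerabilities were discovered by Xinru Chi (迟欣茹) and Tielei Wang (王铁磊) of Pangu Lab and reported to Apple for fixing.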