Date: 2015-06-28 00:00:00 Source: IT貓撲網 Author: 網管聯盟
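I ran into a problem on a project. Two machines hold a TCP connection over sockets, with heavy traffic in both directions. We cut the network by configuring a 100% packet loss rate on the router; naturally the sockets could neither send nor receive, and a large number of retransmissions appeared. We then removed the router setting to restore the network. Traffic from client to server returned to normal, but traffic from server to client did not: no matter how hard we called send, nothing went out; the return value was 0 and errno was EAGAIN.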
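To make the symptom concrete, here is a minimal sketch of how a non-blocking send() result is usually interpreted. On Linux a full send buffer is normally reported as a return value of -1 with errno set to EAGAIN (or EWOULDBLOCK). The names send_status, classify_send and try_send are hypothetical helpers made up for this illustration; they are not from the program described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical names: none of these come from the original program. */
enum send_status {
    SEND_OK,          /* send() accepted the data (n bytes were queued) */
    SEND_WOULD_BLOCK, /* send buffer is full right now; retry when writable */
    SEND_FAILED       /* a real error such as ECONNRESET or EPIPE */
};

/* Classify the (return value, errno) pair of a non-blocking send(). */
enum send_status classify_send(ssize_t n, int err)
{
    if (n >= 0)
        return SEND_OK;
    if (err == EAGAIN || err == EWOULDBLOCK)
        return SEND_WOULD_BLOCK;
    return SEND_FAILED;
}

/* Try to send without blocking; *sent receives the number of bytes queued. */
enum send_status try_send(int fd, const void *buf, size_t len, size_t *sent)
{
    assert(buf != NULL && sent != NULL);
    ssize_t n = send(fd, buf, len, MSG_DONTWAIT);
    int err = errno; /* save errno before anything can overwrite it */
    *sent = n > 0 ? (size_t)n : 0;
    return classify_send(n, err);
}
```

If try_send keeps returning SEND_WOULD_BLOCK long after the network is back, the kernel still considers the send buffer full, which matches the behaviour seen on the server side.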
I captured the packets at that point with tcpdump (tc2 is the server, tc1 is the client):
12:08:21.020291 IP tc1.corp.com.42171 > tc2.corp.com.3003: S 4009389430:4009389430(0) win 5840
12:08:21.020571 IP tc2.corp.com.3003 > tc1.corp.com.42171: R 0:0(0) ack 4009389431 win 0
12:08:38.934329 IP tc2.corp.com.3903 > tc1.corp.com.3904: P 2398055392:2398056153(761) ack 2538876742 win 724
12:08:38.934519 IP tc1.corp.com.3904 > tc2.corp.com.3903: . ack 2165 win 13756
12:08:39.958457 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1:763(762) ack 2165 win 13756
12:08:39.958485 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 763 win 1448
12:08:39.958653 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 763:881(118) ack 2165 win 13756
12:08:39.958660 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 881:997(116) ack 2165 win 13756
12:08:39.958719 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 997 win 1448
12:08:39.958890 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 997:1114(117) ack 2165 win 13756
12:08:39.958898 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1114:1232(118) ack 2165 win 13756
12:08:39.958903 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1232:1349(117) ack 2165 win 13756
12:08:39.958971 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1349 win 1448
12:08:39.959141 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1349:1466(117) ack 2165 win 13756
12:08:39.959149 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1466:1583(117) ack 2165 win 13756
12:08:39.959154 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1583:1700(117) ack 2165 win 13756
12:08:39.959222 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1700 win 1448
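tc2 sends none of its own data; it only ACKs the data arriving from tc1. Half an hour later it is still the same. Why doesn't it send?
It turned out to be because we had set TCP_NODELAY on the socket. After removing that setting and restarting the program, both directions of the TCP connection worked normally once the network was cut and restored. The same view through tcpdump: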
16:05:38.782427 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: P 0:887(887) ack 1 win 26064
16:05:38.782619 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 3783 win 25352
16:05:38.782634 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 3783:5231(1448) ack 1 win 26064
16:05:38.782637 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 5231:6679(1448) ack 1 win 26064
16:05:38.782890 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 5231 win 25352
16:05:38.782896 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 6679:8127(1448) ack 1 win 26064
16:05:38.782898 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 8127:9575(1448) ack 1 win 26064
16:05:38.782901 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 6679 win 25352
16:05:38.782904 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 9575:11023(1448) ack 1 win 26064
16:05:38.783183 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 8127 win 25352
16:05:38.783188 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 11023:12471(1448) ack 1 win 26064
16:05:38.783191 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 9575 win 25352
16:05:38.783193 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 12471:13919(1448) ack 1 win 26064
16:05:38.783196 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 11023 win 25352
16:05:38.783199 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 13919:15367(1448) ack 1 win 26064
16:05:38.783201 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 15367:16815(1448) ack 1 win 26064
16:05:38.783502 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 12471 win 25352
16:05:38.783506 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 16815:18263(1448) ack 1 win 26064
16:05:38.783509 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 13919 win 25352
16:05:38.783512 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 18263:19711(1448) ack 1 win 26064
16:05:38.783514 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 15367 win 25352
16:05:38.783517 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 19711:21159(1448) ack 1 win 26064
16:05:38.783519 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 16815 win 25352
This time tc2 sends its own data stream and tc1 ACKs it. After a while tc1 starts sending data too, and in the end both directions are normal.
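For reference, TCP_NODELAY is switched on and off with setsockopt() at the IPPROTO_TCP level. The sketch below shows one way to toggle it and read it back. The helper names set_nodelay and get_nodelay are hypothetical and are not taken from the program discussed here.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (on != 0) or disable (on == 0) TCP_NODELAY on a TCP socket.
 * Returns 0 on success, -1 on error with errno set by setsockopt(). */
int set_nodelay(int fd, int on)
{
    assert(fd >= 0);
    int flag = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Read the option back: 1 if Nagle's algorithm is disabled,
 * 0 if it is enabled, -1 if getsockopt() fails. */
int get_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0)
        return -1;
    return flag != 0;
}
```

Calling set_nodelay(fd, 0), or simply never setting the option, leaves Nagle's algorithm enabled, which is the configuration used for the second capture.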
Why can't a socket with TCP_NODELAY set get back to normal after the network recovers?
Let's look at the implementation of the send system call (2.6.9 kernel) and trace it all the way down to the tcp_sendmsg function:
[net/ipv4/tcp.c --> tcp_sendmsg]
813     while (--iovlen >= 0) {
814         int seglen = iov->iov_len;
815         unsigned char __user *from = iov->iov_base;
816
817         iov++;
818
819         while (seglen > 0) {
820             int copy;
821
822             skb = sk->sk_write_queue.prev;
823
824             if (!sk->sk_send_head ||
825                 (copy = mss_now - skb->len) <= 0) {
826
827 new_segment:
828                 /* Allocate new segment. If the interface is SG,
829                  * allocate skb fitting to single page.
830                  */
831                 if (!sk_stream_memory_free(sk))
832                     goto wait_for_sndbuf;
833
834                 skb = sk_stream_alloc_pskb(sk, select_size(sk, tp),
835                                            0, sk->sk_allocation);
836                 if (!skb)
837                     goto wait_for_memory;
Line 831 checks whether there is still room in sndbuf; if not, it jumps to wait_for_sndbuf.
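The check on line 831 can be pictured with a small user-space toy model. This is only a sketch, under the assumption that the test boils down to comparing the bytes already queued for sending against the send buffer limit, and that a non-blocking caller gets EAGAIN when there is no room. The toy_* names and fields are invented for illustration and are not kernel code.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for the two values the check is assumed to compare. */
struct toy_sock {
    int wmem_queued; /* bytes queued for sending, including unACKed data */
    int sndbuf;      /* send buffer limit (what SO_SNDBUF controls) */
};

/* Mirrors the idea of sk_stream_memory_free(): is there room left? */
int toy_memory_free(const struct toy_sock *sk)
{
    return sk->wmem_queued < sk->sndbuf;
}

/* Non-blocking enqueue: queue len bytes, or return -EAGAIN when full. */
int toy_sendmsg(struct toy_sock *sk, int len)
{
    assert(len > 0);
    if (!toy_memory_free(sk))
        return -EAGAIN;     /* models the wait_for_sndbuf path, zero timeout */
    sk->wmem_queued += len; /* simplified: the whole write is accepted */
    return len;
}

/* An ACK from the peer releases queued bytes and makes room again. */
void toy_ack(struct toy_sock *sk, int len)
{
    sk->wmem_queued -= len < sk->wmem_queued ? len : sk->wmem_queued;
}
```

In this model, as long as nothing is ever ACKed, wmem_queued never shrinks and every further toy_sendmsg call fails with -EAGAIN. That is the same shape as the symptom described at the start: data stays queued, so send keeps reporting EAGAIN.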
[n