Linux DHCP (Dynamic Host Configuration Protocol) Failover Implementation

With a single DHCP server configured, the next step is to set up redundancy across two DHCP servers. DHCP failover distinguishes between a primary server and a secondary server. The example here uses:
First machine, primary DHCP server (host name: KHXDHCPS1, IP: 10.69.10.30)
Second machine, secondary DHCP server (host name: KHXDHCPS2, IP: 10.69.10.31)
Both run Red Hat Enterprise Linux ES 4 Update 6, and the dhcpd version is dhcp-3.0.1-59.EL4.

In what follows, a DHCP client that connects should obtain an IP address between 10.69.100.1 and 10.69.100.240
(netmask 255.255.255.0), with 10.69.100.254 as its default gateway.
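
As a quick sanity check on this address plan, the following Python sketch counts the addresses in the range and confirms that the gateway sits in the same /24 as the pool. It uses only the standard-library ipaddress module; the variable names are mine and have nothing to do with dhcpd itself.

```python
import ipaddress

# Values taken from the address plan described above.
first = ipaddress.IPv4Address("10.69.100.1")
last = ipaddress.IPv4Address("10.69.100.240")
gateway = ipaddress.IPv4Address("10.69.100.254")

# Network implied by the first address and the 255.255.255.0 netmask.
client_net = ipaddress.IPv4Network("10.69.100.1/255.255.255.0", strict=False)

# Number of addresses in the inclusive range first..last.
pool_size = int(last) - int(first) + 1

print("client network:", client_net)
print("pool size:", pool_size)
print("gateway in client network:", gateway in client_net)
```

The pool size this computes should match the "total" figure that dhcpd reports in its pool lines in /var/log/messages.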

My dhcpd runs on eth0. If you need it to run on a different network interface, edit /etc/sysconfig/dhcpd.
For example, to run it on eth1:
[root@KHXDHCPS1 ~]# cat /etc/sysconfig/dhcpd
# Command line options here
DHCPDARGS=eth1
Of course, you can also edit the start() section of the dhcpd init script directly:
start() {
# Start daemons.
echo -n $"Starting $prog: "

daemon /usr/sbin/dhcpd ${DHCPDARGS} 2>/dev/null

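The daemon line above defines the options the dhcpd daemon is started with; ${DHCPDARGS} is the value just set in /etc/sysconfig/dhcpd.
Some other available options are:
-f: run the daemon in the foreground. Used most often when testing.
-d: send the DHCP daemon's log output to standard error. Also used most often when testing. If it is not given, the log goes to /var/log/messages.
-cf filename: location of the configuration file. The default is /etc/dhcpd.conf.
-lf filename: location of the lease database. If a lease file already exists, it is important that the DHCP service uses the same file every time it starts.
It is strongly recommended to use this option only for debugging on machines that do not matter. The default location is /var/lib/dhcp/dhcpd.leases.
-q: do not print the full copyright message when the daemon starts.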

The command mentioned in the previous post, /usr/sbin/dhcpd -d -f eth0 -lf /var/lib/dhcp/dhcpd.leases, is the one I use for testing.

Now for the implementation itself.
First, edit the configuration file on the primary DHCP server.
This is the dhcpd.conf of the first DHCP server:
[root@KHXDHCPS1 ~]# cat /etc/dhcpd.conf
ddns-update-style none;
ignore client-updates;
#ignore unknown-clients;

authoritative;
failover peer "dhcp-failover" {
primary;
address 10.69.10.30;
port 690;
peer address 10.69.10.31;
peer port 691;
max-response-delay 30;
max-unacked-updates 10;
load balance max seconds 3;
mclt 1800;
split 128;
}

subnet 10.69.0.0 netmask 255.255.0.0 {
option routers 10.69.100.254;
option subnet-mask 255.255.255.0;
option mobile-ip-home-agent 10.69.10.35;
option domain-name-servers 10.69.10.22;
default-lease-time 21600;
max-lease-time 43200;
pool {
failover peer "dhcp-failover";
range 10.69.100.1 10.69.100.240;
deny dynamic bootp clients;
}
}
This is the dhcpd.conf of the second DHCP server:
[root@KHXDHCPS2 ~]# cat /etc/dhcpd.conf
ddns-update-style none;
ignore client-updates;
#ignore unknown-clients;

authoritative;
failover peer "dhcp-failover" {
secondary;
address 10.69.10.31;
port 691;
peer address 10.69.10.30;
peer port 690;
max-response-delay 30;
max-unacked-updates 10;
load balance max seconds 3;
}

subnet 10.69.0.0 netmask 255.255.0.0 {
option routers 10.69.100.254;
option subnet-mask 255.255.255.0;
option mobile-ip-home-agent 10.69.10.35;
option domain-name-servers 10.69.10.22;
default-lease-time 21600;
max-lease-time 43200;
pool {
failover peer "dhcp-failover";
range 10.69.100.1 10.69.100.240;
deny dynamic bootp clients;
}
}
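A few points are worth noting here:
1. The ddns-update-style line is mandatory. It controls dynamic DNS updates and accepts three settings:
ddns-update-style ad-hoc
ddns-update-style interim
ddns-update-style none
2. The "deny dynamic bootp clients;" line is needed because failover does not support dynamically allocated BOOTP clients, so they have to be denied in the failover pool.
If you were hoping to run a boot server that hands out dynamic BOOTP addresses from a DHCP failover pool, you will be disappointed...
3. failover peer "dhcp-failover" gives the failover relationship a name.
4. primary; designates this DHCP server as the master server.
5. secondary; designates this DHCP server as the slave server.
6. address 10.69.10.30; is the IP address on which this server listens for failover messages.
7. port 690; and peer port 691; are the TCP ports for failover messages: port is the one this server listens on, and peer port is the one it connects to on the peer.
8. peer address 10.69.10.31; is the IP address of the failover peer (here, the slave server).
9. max-response-delay 30; is the number of seconds that may pass without a message from the peer before the connection is considered to have failed.
10. max-unacked-updates 10; is the maximum number of BNDUPD messages that may be sent before a BNDACK is received from the peer.
11. mclt 1800; is the Maximum Client Lead Time: the length of time, in seconds, for which either peer may extend a lease without contacting the other.
12. split 128; sets how clients are divided between the two servers for load balancing. 128 gives a roughly even split and is the usual choice rather than a hard requirement. Both this value and mclt need to be set only in the primary's configuration file.
13. option mobile-ip-home-agent 10.69.10.35; is the IP address of the Home Agent used for WiMAX in my test environment.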
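
The mirroring between the two files (points 6 to 8) and the primary-only settings (point 12) are easy to get wrong when editing by hand. The sketch below, in Python, pulls the failover peer block out of each configuration and compares them. It is a rough check under simplifying assumptions: the parser ignores comments and handles only the first failover peer block, and parse_failover and check_pair are hypothetical helper names, not part of ISC DHCP.

```python
import re


def parse_failover(conf_text):
    """Extract the statements of the first `failover peer` block into a dict."""
    block = re.search(r'failover\s+peer\s+"([^"]+)"\s*\{(.*?)\}', conf_text, re.S)
    if block is None:
        raise ValueError("no failover peer block found")
    settings = {"name": block.group(1)}
    for statement in block.group(2).split(";"):
        words = statement.split()
        if not words:
            continue
        if words[0] in ("primary", "secondary"):
            settings["role"] = words[0]
        else:
            # Everything before the last word is the key, e.g. "peer address".
            settings[" ".join(words[:-1])] = words[-1]
    return settings


def check_pair(primary, secondary):
    """Return a list of problems found when comparing the two peer blocks."""
    problems = []
    if primary.get("role") != "primary" or secondary.get("role") != "secondary":
        problems.append("roles must be primary and secondary")
    # Each side's local endpoint must be the other side's peer endpoint.
    for local, remote in (("address", "peer address"), ("port", "peer port")):
        if primary.get(local) != secondary.get(remote):
            problems.append(f"primary {local} != secondary {remote}")
        if secondary.get(local) != primary.get(remote):
            problems.append(f"secondary {local} != primary {remote}")
    # mclt and split are set on the primary only.
    for key in ("mclt", "split"):
        if key not in primary:
            problems.append(f"primary is missing {key}")
        if key in secondary:
            problems.append(f"secondary should not set {key}")
    return problems


PRIMARY_CONF = """
failover peer "dhcp-failover" {
  primary;
  address 10.69.10.30;
  port 690;
  peer address 10.69.10.31;
  peer port 691;
  max-response-delay 30;
  max-unacked-updates 10;
  load balance max seconds 3;
  mclt 1800;
  split 128;
}
"""

SECONDARY_CONF = """
failover peer "dhcp-failover" {
  secondary;
  address 10.69.10.31;
  port 691;
  peer address 10.69.10.30;
  peer port 690;
  max-response-delay 30;
  max-unacked-updates 10;
  load balance max seconds 3;
}
"""

problems = check_pair(parse_failover(PRIMARY_CONF), parse_failover(SECONDARY_CONF))
print(problems or "failover blocks mirror each other")
```

An empty problem list only says that the two blocks are consistent with each other; it does not replace starting dhcpd and reading its log.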

Next comes verification.
Start the dhcpd service, then look at /var/log/messages on both DHCP servers:
[root@KHXDHCPS1 ~]# service dhcpd restart
Shutting down dhcpd: [ OK ]
Starting dhcpd: [ OK ]
These are the dhcpd startup messages on the first DHCP server:
[root@KHXDHCPS1 ~]# tail -f /var/log/messages
Jul 10 10:38:14 KHXDHCPS1 dhcpd: dhcpd shutdown succeeded
Jul 10 10:38:14 KHXDHCPS1 dhcpd: dhcpd shutdown succeeded
Jul 10 10:59:54 KHXDHCPS1 sshd(pam_unix)[29885]: session opened for user root by (uid=0)


Jul 10 11:02:29 KHXDHCPS1 dhcpd: Internet Systems Consortium DHCP Server V3.0.1
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Copyright 2004 Internet Systems Consortium.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: All rights reserved.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Internet Systems Consortium DHCP Server V3.0.1
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Copyright 2004 Internet Systems Consortium.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: All rights reserved.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Wrote 0 leases to leases file.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Wrote 0 leases to leases file.
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Listening on LPF/eth0/00:1e:c9:ad:55:bf/10.69/16
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Sending on LPF/eth0/00:1e:c9:ad:55:bf/10.69/16
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Sending on Socket/fallback/fallback-net
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Listening on LPF/eth0/00:1e:c9:ad:55:bf/10.69/16
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Sending on LPF/eth0/00:1e:c9:ad:55:bf/10.69/16
Jul 10 11:02:29 KHXDHCPS1 dhcpd: Sending on Socket/fallback/fallback-net
Jul 10 11:02:29 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover to startup
Jul 10 11:02:29 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover to startup
Jul 10 11:02:29 KHXDHCPS1 dhcpd: dhcpd startup succeeded
Jul 10 11:02:29 KHXDHCPS1 dhcpd: dhcpd startup succeeded
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from unknown-state to recover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: requesting full update from peer
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from startup to recover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Sent update request all message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from recover to recover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: requesting full update from peer
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Sent update request all message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Sent update done message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Update request all from dhcp-failover: nothing pending
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Sent update done message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS1 dhcpd: Update request all from dhcp-failover: nothing pending
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer update completed.
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover to recover-done
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer update completed.
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from recover to recover-done
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover-done to normal
Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from recover-done to normal
Jul 10 11:02:31 KHXDHCPS1 dhcpd: pool 9d0e008 10.69/16 total 240 free 240 backup 0 lts -120
Jul 10 11:02:31 KHXDHCPS1 dhcpd: pool 9d0e008 10.69/16 total 240 free 240 backup 0 lts 120
These are the dhcpd startup messages on the second DHCP server:
[root@KHXDHCPS2 ~]# tail -f /var/log/messages
Jul 10 10:38:08 KHXDHCPS2 dhcpd: dhcpd shutdown succeeded
Jul 10 10:38:08 KHXDHCPS2 dhcpd: dhcpd shutdown succeeded
Jul 10 10:59:57 KHXDHCPS2 sshd(pam_unix)[31706]: session opened for user root by (uid=0)


Jul 10 11:02:30 KHXDHCPS2 dhcpd: Internet Systems Consortium DHCP Server V3.0.1
Jul 10 11:02:30 KHXDHCPS2 dhcpd: Copyright 2004 Internet Systems Consortium.
Jul 10 11:02:30 KHXDHCPS2 dhcpd: All rights reserved.
Jul 10 11:02:30 KHXDHCPS2 dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jul 10 11:02:30 KHXDHCPS2 dhcpd: Internet Systems Consortium DHCP Server V3.0.1
Jul 10 11:02:30 KHXDHCPS2 dhcpd: Copyright 2004 Internet Systems Consortium.
Jul 10 11:02:30 KHXDHCPS2 dhcpd: All rights reserved.
Jul 10 11:02:30 KHXDHCPS2 dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jul 10 11:02:30 KHXDHCPS2 dhcpd: Wrote 0 leases to leases file.
Jul 10 11:02:30 KHXDHCPS2 dhcpd: Wrote 0 leases to leases file.
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Listening on LPF/eth0/00:1e:c9:ad:55:a6/10.69/16
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sending on LPF/eth0/00:1e:c9:ad:55:a6/10.69/16
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sending on Socket/fallback/fallback-net
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Listening on LPF/eth0/00:1e:c9:ad:55:a6/10.69/16
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sending on LPF/eth0/00:1e:c9:ad:55:a6/10.69/16
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sending on Socket/fallback/fallback-net
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: I move from recover to startup
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: I move from recover to startup
Jul 10 11:02:31 KHXDHCPS2 dhcpd: dhcpd startup succeeded
Jul 10 11:02:31 KHXDHCPS2 dhcpd: dhcpd startup succeeded
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer moves from unknown-state to recover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: requesting full update from peer
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: I move from startup to recover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sent update request all message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer moves from recover to recover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: requesting full update from peer
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sent update request all message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sent update done message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Update request all from dhcp-failover: nothing pending
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Sent update done message to dhcp-failover
Jul 10 11:02:31 KHXDHCPS2 dhcpd: Update request all from dhcp-failover: nothing pending
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer update completed.
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: I move from recover to recover-done
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer update completed.
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer moves from recover to recover-done
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: I move from recover-done to normal
Jul 10 11:02:31 KHXDHCPS2 dhcpd: failover peer dhcp-failover: peer moves from recover-done to normal
Jul 10 11:02:31 KHXDHCPS2 dhcpd: pool 9399ed0 10.69/16 total 240 free 240 backup 0 lts 120
Clearly, KHXDHCPS1 is now indeed acting as the primary server and KHXDHCPS2 as the secondary server.
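
Reading the state changes out of a long log by eye gets tedious, so here is a Python sketch that extracts the "I move from ... to ..." and "peer moves from ... to ..." messages and reports the last state seen for each side. The regular expression is written against the message wording in the logs above; final_states is a hypothetical helper name.

```python
import re

# Matches both "I move from X to Y" and "peer moves from X to Y".
TRANSITION = re.compile(
    r"failover peer (?P<name>\S+): (?P<who>I move|peer moves) "
    r"from (?P<old>\S+) to (?P<new>\S+)"
)


def final_states(log_lines):
    """Return the last reported state of the local server and of its peer."""
    states = {"local": None, "peer": None}
    for line in log_lines:
        match = TRANSITION.search(line)
        if match:
            side = "local" if match.group("who") == "I move" else "peer"
            states[side] = match.group("new")
    return states


# A few lines copied from the KHXDHCPS1 startup log.
sample = [
    "Jul 10 11:02:29 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover to startup",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from unknown-state to recover",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from startup to recover",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover to recover-done",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from recover to recover-done",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: I move from recover-done to normal",
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: failover peer dhcp-failover: peer moves from recover-done to normal",
]
print(final_states(sample))
```

A healthy pair should end with both sides in the normal state, as in the logs above.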

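Next, plug the Quanta Beceem BCS200 WiMAX card into the computer and try to obtain an IP address. When a DHCP client requests an IP from a DHCP server there are four main steps (DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, DHCPACK). If all four appear, the client has successfully obtained an IP, and the lease is recorded in /var/lib/dhcp/dhcpd.leases. At this point the log on KHXDHCPS1 (master server) shows: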
Jul 10 11:03:37 KHXDHCPS1 dhcpd: pool 9d0e008 10.69/16 total 240 free 120 backup 120 lts 0
Jul 10 11:03:37 KHXDHCPS1 dhcpd: DHCPDISCOVER from 00:17:c4:12:77:97 via 10.69.10.11
Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPOFFER on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPREQUEST for 10.69.100.120 (10.69.10.30) from 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPACK on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
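
The four-step check can be automated. The Python sketch below scans dhcpd log lines and reports whether DHCPDISCOVER, DHCPOFFER, DHCPREQUEST and DHCPACK appear in that order for a given MAC address. It assumes the message layout seen in the log above; completed_handshake is a hypothetical helper name.

```python
import re

STEPS = ["DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"]

# Captures the message type and the first MAC address after it.
MESSAGE = re.compile(r"dhcpd: (DHCP[A-Z]+) .*?\b([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b")


def completed_handshake(log_lines, mac):
    """True if the four messages appear in order for the given MAC address."""
    expected = 0
    for line in log_lines:
        match = MESSAGE.search(line)
        if not match or match.group(2) != mac.lower():
            continue
        if match.group(1) == STEPS[expected]:
            expected += 1
            if expected == len(STEPS):
                return True
        elif match.group(1) == "DHCPDISCOVER":
            # A fresh DISCOVER restarts the exchange.
            expected = 1
    return False


sample = [
    "Jul 10 11:03:37 KHXDHCPS1 dhcpd: DHCPDISCOVER from 00:17:c4:12:77:97 via 10.69.10.11",
    "Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPOFFER on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11",
    "Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPREQUEST for 10.69.100.120 (10.69.10.30) from 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11",
    "Jul 10 11:03:38 KHXDHCPS1 dhcpd: DHCPACK on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11",
]
print(completed_handshake(sample, "00:17:c4:12:77:97"))
```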
Meanwhile, the log on KHXDHCPS2 (slave server) shows only:
Jul 10 11:03:37 KHXDHCPS2 dhcpd: pool 9399ed0 10.69/16 total 240 free 120 backup 120 lts 0
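
The pool lines are the quickest way to see how the free addresses are shared between the two servers. This Python sketch parses one such line into numbers. The field names come straight from the log text; my reading of free as the addresses held by the primary, backup as those held by the secondary, and lts as "leases to send" is an interpretation and not something the log states. parse_pool is a hypothetical helper name.

```python
import re

POOL = re.compile(
    r"pool (?P<id>\S+) (?P<subnet>\S+) total (?P<total>\d+) "
    r"free (?P<free>\d+) backup (?P<backup>\d+) lts (?P<lts>-?\d+)"
)


def parse_pool(line):
    """Return the fields of a dhcpd pool line as a dict, or None if no match."""
    match = POOL.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("total", "free", "backup", "lts"):
        fields[key] = int(fields[key])
    return fields


# Pool line right after startup, and the one logged when the client arrived.
before = parse_pool(
    "Jul 10 11:02:31 KHXDHCPS1 dhcpd: pool 9d0e008 10.69/16 "
    "total 240 free 240 backup 0 lts -120"
)
after = parse_pool(
    "Jul 10 11:03:37 KHXDHCPS1 dhcpd: pool 9d0e008 10.69/16 "
    "total 240 free 120 backup 120 lts 0"
)
print("before:", before["free"], before["backup"], before["lts"])
print("after:", after["free"], after["backup"], after["lts"])
print("balanced:", after["free"] == after["backup"])
```

In the logs above the pool starts with all 240 addresses counted as free and ends with 120 in each column, which fits the even division requested by split 128.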
On the client, the correct IP address has indeed been assigned:
C:\Documents and Settings\Demo>ipconfig /all

Windows IP Configuration

Host Name . . . . . . . . . . . . : WiMAX-demoXX
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No

Ethernet adapter Wireless Network Connection:

Media State . . . . . . . . . . . : Media disconnected
Description . . . . . . . . . . . : Intel(R) PRO/Wireless 3945ABG Network Connection
Physical Address. . . . . . . . . : 00-18-DE-19-B5-92

Ethernet adapter Local Area Connection:

Media State . . . . . . . . . . . : Media disconnected
Description . . . . . . . . . . . : Intel(R) PRO/1000 PL Network Connection
Physical Address. . . . . . . . . : 00-15-58-30-80-C5

Ethernet adapter Local Area Connection 4:


Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Beceem Communications Inc. BCS200
Physical Address. . . . . . . . . : 00-17-C4-12-77-97
Dhcp Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IP Address. . . . . . . . . . . . : 10.69.100.120
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.69.100.254
DHCP Server . . . . . . . . . . . : 10.69.10.30
DNS Servers . . . . . . . . . . . : 10.69.10.22
Lease Obtained. . . . . . . . . . : Thursday, July 10, 2008 11:07:15 PM
Lease Expires . . . . . . . . . . : Thursday, July 10, 2008 12:37:15 PM
Looking at /var/lib/dhcp/dhcpd.leases now, it contains a new lease record for that IP:
lease 10.69.100.120 {
starts 4 2008/07/10 03:03:38;
ends 4 2008/07/10 03:33:38;
cltt 4 2008/07/10 03:03:38;
binding state active;
next binding state expired;
hardware ethernet 00:17:c4:12:77:97;
uid "\001\000\027\304\022w\227";
client-hostname "WiMAX-demoXX";
}
The slave machine, of course, holds an identical record.
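
The timestamps in dhcpd.leases look eight hours off from the syslog entries because the lease file records dates in UTC. The Python sketch below parses the starts/ends/cltt lines and converts them. The UTC+8 offset is my assumption, inferred from the 11:03:38 syslog entry and the 03:03:38 lease entry; lease_times is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches lines such as "starts 4 2008/07/10 03:03:38;" (the digit is the weekday).
LEASE_TIME = re.compile(
    r"^\s*(starts|ends|cltt)\s+\d\s+(\d{4}/\d\d/\d\d \d\d:\d\d:\d\d);", re.M
)


def lease_times(lease_text):
    """Map starts/ends/cltt to timezone-aware UTC datetimes."""
    times = {}
    for key, stamp in LEASE_TIME.findall(lease_text):
        parsed = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
        times[key] = parsed.replace(tzinfo=timezone.utc)
    return times


LEASE = """lease 10.69.100.120 {
  starts 4 2008/07/10 03:03:38;
  ends 4 2008/07/10 03:33:38;
  cltt 4 2008/07/10 03:03:38;
  binding state active;
  next binding state expired;
}"""

times = lease_times(LEASE)
local = timezone(timedelta(hours=8))  # assumption: the servers run at UTC+8
start_local = times["starts"].astimezone(local).strftime("%H:%M:%S")
duration = int((times["ends"] - times["starts"]).total_seconds())
print("lease start, local time:", start_local)
print("lease length in seconds:", duration)
```

The lease length works out to 1800 seconds, which matches mclt 1800 rather than default-lease-time 21600. As far as I understand, that is expected with failover: the first lease a client receives is limited to the MCLT, and later renewals can be longer.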

Next, test that an IP release followed by an IP renew returns the same IP address:

These are the messages on the master DHCP server:
Jul 10 11:05:31 KHXDHCPS1 dhcpd: DHCPRELEASE of 10.69.100.120 from 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11 (found)
Jul 10 11:05:35 KHXDHCPS1 dhcpd: DHCPDISCOVER from 00:17:c4:12:77:97 via 10.69.10.11
Jul 10 11:05:36 KHXDHCPS1 dhcpd: DHCPOFFER on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
Jul 10 11:05:36 KHXDHCPS1 dhcpd: DHCPREQUEST for 10.69.100.120 (10.69.10.30) from 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
Jul 10 11:05:36 KHXDHCPS1 dhcpd: DHCPACK on 10.69.100.120 to 00:17:c4:12:77:97 (WiMAX-demoXX) via 10.69.10.11
These are the messages on the slave DHCP server:
Jul 10 11:05:31 KHXDHCPS2 dhcpd: DHCPRELEASE of 10.69.100.120 from 00:17:c4:12:77:97 via 10.69.10.11 (found)
Jul 10 11:05:35 KHXDHCPS2 dhcpd: pool 9399ed0 10.69/16 total 240 free 120 backup 120 lts 0
This is the relevant part of /var/lib/dhcp/dhcpd.leases:
lease 10.69.100.120 {
starts 4 2008/07/10 03:03:38;
ends 4 2008/07/10 03:05:31;
cltt 4 2008/07/10 03:03:38;
binding state released;
next binding state free;
hardware ethernet 00:17:c4:12:77:97;
uid "\001\000\027\304\022w\227";
client-hostname "WiMAX-demoXX";
}
lease 10.69.100.120 {
starts 4 2008/07/10 03:03:38;
ends 4 2008/07/10 03:05:31;
tstp 4 2008/07/10 03:05:31;
cltt 4 2008/07/10 03:03:38;
binding state free;
hardware ethernet 00:17:c4:12:77:97;
uid "\001\000\027\304\022w\227";
}
lease 10.69.100.120 {
starts 4 2008/07/10 03:05:36;
ends 4 2008/07/10 03:35:36;
cltt 4 2008/07/10 03:05:36;
binding state active;
next binding state expired;
hardware ethernet 00:17:c4:12:77:97;
uid "\001\000\027\304\022w\227";
client-hostname "WiMAX-demoXX";
}
That concludes today's implementation write-up.

Related references:
DHCP Failover/load balancing
Failover with ISC DHCP
1 Response
  1. Jim Huang Says:

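    Hello,
    Sorry to bother you with a question.
    I already have DHCP set up and wanted to try converting it to failover DHCP, but as soon as I add the peer "dhcp-failover" section, dhcpd runs into problems when it starts. Do you know what the cause might be?

    Operating system: CentOS 5
    Version: dhcp-3.0.5-33.el5_9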