I’ve been working on tuning our network configuration between our primary and standby databases. This entry shows the steps I took.
The first thing to check is our bandwidth. I talked with our Network Admin and we have a DS-3 link between our primary data center at our corporate headquarters and our DR site 30 miles away. This normally means a bandwidth of 45Mbps. However, we also employ a WAN Accelerator to compress and optimize the network traffic. Our WAN Accelerator shows we are optimizing 70% of the traffic. This gives us an effective bandwidth of about 65Mbps (65Mbps * 70% = 45.5 Mbps).
The question now becomes, is this enough? To answer that question, I used the following query to determine the redo log generation each hour.
select inst1.hour,inst1.num_1,to_char(inst1.bytes_1,'999,999,999,999') as bytes_1,
inst2.num_2,to_char(inst2.bytes_2,'999,999,999,999') as bytes_2,
inst3.num_3,to_char(inst3.bytes_3,'999,999,999,999') as bytes_3
(select to_char(first_time,'YYYY-MM-DD HH24') as hour,count(*) as num_1, sum(blocks*block_size) as bytes_1
from gv$archived_log where inst_id=1 and thread#=1 and dest_id=1
group by to_char(first_time,'YYYY-MM-DD HH24')) inst1,
(select to_char(first_time,'YYYY-MM-DD HH24') as hour,count(*) as num_2, sum(blocks*block_size) as bytes_2
from gv$archived_log where inst_id=2 and thread#=2 and dest_id=1
group by to_char(first_time,'YYYY-MM-DD HH24')) inst2,
(select to_char(first_time,'YYYY-MM-DD HH24') as hour,count(*) as num_3, sum(blocks*block_size) as bytes_3
from gv$archived_log where inst_id=3 and thread#=3 and dest_id=1
group by to_char(first_time,'YYYY-MM-DD HH24')) inst3
where inst1.hour=inst2.hour and inst2.hour=inst3.hour
order by 1;
We have a 3-node RAC primary database. The above will get the number of logs and redo bytes generated each hour of the data from V$ARCHIVED_LOG. I’m sure there is a more elegant way of writing this query, but this is how I did it. I exported the results to an Excel spreadsheet which let me convert the bytes per hour to megabits per second. I then determined the one singular hour which had the highest Mbps rate. I determined that our max hour period of time generated 57.08Mbps of redo. Our bandwidth needs are sufficient.
The next thing to do was to compute our Bandwidth Delay Product (BDP). The BDP is a function of our bandwidth and the latency. The latency is about 10ms between the sites, but I will use 15ms to account for times of higher network congestion. As previously stated, the DS-3 link is 45Mbps.
BDP = 45Mbps * 15ms
= 45,000,000 * 0.015
= 675,000 megabits / 8 bits = 84,375 bytes
The calculations I have seen specify that we need a buffer size 3 times larger than the BDP. Which gives us the following calculation:
Buffer = 84,375 * 3 = 253,125
I rounded this up to 262,144 (which is 1024 * 256 = 2^10 * 2^8 = 2^18) . I like things to be nice powers of 2). This means I need to set the Oracle Net RECV_BUF_SIZE and SEND_BUF_SIZE to 262144. I also need to set the Session Data Unit (SDU) to be 32767.
Before we can begin, we need to look at the Linux kernel to verify the read and write max TCP buffer sizes:
[oracle@myhost ~]$ /sbin/sysctl -a | grep net.core | grep mem_max
net.core.wmem_max = 1048576
net.core.rmem_max = 4194304
The write max buffer size is about 1MB. The read max buffer size is about 4MB. I previously calculated the buffer size to be 262,144 bytes which is less than these maximums. If the max values were too low, then I would need to adjust the Linux kernel settings.
Now we need to check the kernel socket size for reads and writes.
[oracle@myhost ~]$ /sbin/sysctl -a | grep ipv4.tcp | grep mem
net.ipv4.tcp_wmem = 4096 16384 2067936
net.ipv4.tcp_rmem = 4096 87380 2067936
The three values are, in order, minimum, default, and maximum. We only need to verify that the maximum is larger than our computed value of 262,144 bytes, which they are.
Now I am on to the network configuration changes. As stated earlier, I need to set the SDU to 32767 and to handle the new TCP buffer sizes. This is done in my tnsnames.ora config file. First, I need to verify the service name I am using.
SQL> show parameter log_archive_dest_2
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2 string service="ress", LGWR ASYNC NOA
FFIRM delay=0 optional compres
sion=disable max_failure=0 max
_connections=1 reopen=300 db_u
From this, we can clearly see the service name being used. Now we need to modify the alias for this service name in our tnsnames.ora config file. It should read as follows:
(ADDRESS = (PROTOCOL = TCP)(HOST = dr-rac01-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = dr-rac02-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(SERVER = DEDICATED)
(SERVICE_NAME = ress)
Note that primary is a three-node RAC system. So I need to make the same change on the other nodes as well. Also note that I need to make a corresponding change to the standby side in case I ever need to perform a switchover operation.
Now I need to update the listener.ora on each node of the primary and each node of the standby.
(SID_NAME = resp1))
Don’t forget to reload the listeners. Since this is RAC on both primary and standby, I just used srvctl to start/stop the listeners.
That’s all there is to it.