Jan
19

Agent Is Blocked

I recently upgraded a database from Oracle 10.2.0.3 to Oracle 11.2.0.2. Everything went well. Some time after the upgrade, Grid Control 11g sent me an alert with the following:

Agent is blocked. Blocked reason is: Agent is out-of-sync with repository. This most likely means that the agent was reinstalled or recovered. Please contact an EM administrator to unblock the agent by performing an agent resync from the console.

At first, I had no clue how to fix this, but then I discovered the cure. Sign on to Grid Control and then click on Targets -> All Targets. Find the Agent that is blocked on click on it. Then click no the Agent Resynchronization button as seen in the screenshot below.

Make sure “Unblock agent” is selected on the next screen and press Continue. After some time, the problem will be fixed.

Jan
03

Error Adding Standby

I have been trying to create a Standby database from a new Primary that was recently put into production. I like how Grid Control automates much of the work for me. So Enterprise Manager is my preferred method. I recently ran into a problem using the Add Standby Database wizard in my 11g Grid Control. The primary database is Oracle 11.2.0.2 in case anyone is interested.

I fire up Grid Control and select my new Primary database. Click on the Availability tab and then follow the links and wizard to add a new standby database. When I get to step 5, I get the error shown in the screen shot.

Add Standby Screenshot

After getting little help from my Service Request, I stumbled on the problem. I had a previous Standby database for this Primary and I had to remove and then readd the Standby. The problem was that the Primary was already part of a Data Guard Broker configuration and the metadata files already exist on disk. To solve the problem, I did the following:

1. dgmgrl
2. connect /
3. remove configuration;
4. exit.
5. sqlplus /nolog
6. alter system set dg_broker_start=false scope=both;
7. alter system set dg_broker_config_file1=’ ‘ scope=both;
8. alter system set dg_broker_config_file2=’ ‘ scope=both;
9. exit
10. cd /directory_to_dg_broker_files
11. rm dr1sid.dat dr2sid.dat

At this point, I was able to re-run the Add Standby Database wizard in Grid Control.

Dec
13

Brand New Production Database

Every once in a while, no matter which company I work for, I am asked to set up a new production database. I was working on this very task today when I started thinking about how much work it was to create a brand new database in that past, how much the DBCA handles for us today, and how much is still left to do.

Currently, we have a development and test database for our third party application. We will be rolling out the application to production by the end of the week so I was tasked with setting up a production version of this database. The production database server is a 3-node RAC cluster which has already been set up for me because we are currently running two other databases on the cluster. So this saves me the step of installing and configuring Grid Infrastructure and the RDBMS software. But as I started setting up the database, I got to thinking about how much work I still had left to do. And since we rarely set up brand new production databases, some of these tasks are not as easily remembered as others. Below are the steps I went through today to get the production database up and running.

1. Using the dev/test databases as my guide, I determined my memory and disk storage requirements.
2. I verified the production RAC cluster had enough memory to support the new database instances.
3. I worked with my Storage Admin to get the necessary disk storage mounted on the cluster.
4. I then fired up the DBCA to create the brand new database. I walked through the wizard and filled in appropriate values and then let DBCA do its magic.
5. I really do not like how the DBCA lets me create/allocate redo logs so after the database was created, I created my own redo logs groups (multiplexed of course) and dropped the redo log groups the DBCA created for me.
6. I can never figure out how to add a 3rd control file in the DBCA. So after the database is created, I shut it down, make a 3rd copy of the control file, update the SPFILE with the fact that there are now 3 control files and start the database.
7. The DBCA put my password file and spfile in locations that are not optimal for me. So I moved them. In $ORACLE_HOME/dbs I created softlinks pointing to the new locations. Then I used srvctl to update the spfile location in CRS.
8. I’ve never once used the DBCA to set up archivelog mode. So I always skip that part of the DBCA. In addition, I like the idea of not archiving my redo logs when the DBCA is creating the database to speed up that process. So at this point, I set up archive logging for the database.
9. The database will be used with a Standby and I like to ensure I have a log switch at least once per hour, so I set ARCHIVE_LAG_TARGET to 3600.

At this point, the bare bones database is set up and ready for use. Now its time to get the database read for our application.

10. I set up any required tablespaces for the application.
11. I set up any required users for the application.
12. Changed the database’s default tablespace to one of the ones I created above. Then dropped the USERS tablespace.
13. Since this is a RAC database, we need to set up the service for the application to connect.

14. Now that the database is ready for the application, we need to set up the Standby database. This was easily done by using the Add Standby Database wizard in Grid Control.
15. Our Standby database is on a 2-node RAC cluster. The Add Standby Database wizard creates a single-instance database so the Convert to Cluster Database wizard was run in Grid Control to make the Standby a RAC database.

Finally, the last step was to ensure any maintenance tasks were extended to the new database. For example, cron jobs to delete old log files needed to be modified for the new instance.

Wow! That’s a lot of work to set up an initial database in our production environment. As I said in the beginning, the DBCA does a lot of work for us now. And Grid Control automates a lot of the Standby creation work as well. But there is still a lot of steps involved.

Dec
12

Wait Event: asynch descriptor resize

In my recently upgraded production database, I am seeing a number of SQL statements that are now experiencing high waits on the “asynch descriptor resize” event. I recently upgraded from 11.1.0.7 to 11.2.0.2 and SQL statements that never waited on this event are now getting caught.

Oracle 11.2 slightly changed the way the database and the OS kernel performed Asynchronous I/O calls. What is happening is that there a number of asynchronous I/O descriptors to be able to handle the async I/O calls. When the number of asynch I/O calls increases, the number of descriptors increases as well. When the number of asynch I/O calls decreases, the number of descriptors is decreased in a similar fashion.

Before Oracle can increase the number of descriptors, it must wait for all processes that are currently performing async I/O to complete their I/O calls. This terrible action really kills the “asynch” part of the I/O! Once all processes have completed their ansych I/O calls, Oracle can then modify the descriptors up or down depending on workload.

If your process just got done completing its asynch I/O, before it can make another asynch I/O call, it must wait for Oracle to modify the number of descriptors. As such, you are waiting for the “asynch descriptor resize” wait event.

This appears to be Bug 9829397 and you can download a patch for it from Metalink. This issue is fixed in 11.2.0.3 so it only appears in 11.2.0.1 and 11.2.0.2. One workaround is to disable Asynch I/O, but to me that workaround is highly undesirable. One can also reduce the occurences of this wait event by reducing their direct I/O and tuning their SQL statements.

Nov
22

HUD in a Contact

As a contact lens wearer, I like this idea:

http://www.dailymail.co.uk/sciencetech/article-2064543/Computerised-contact-lens-date-news-texts.html

It would be really cool to get a Heads Up Display right in my eyeball!

Nov
22

Background Processes

I swear that as I get further and further along in my Oracle career, I have a harder and harder time keeping up with Oracle’s background processes. I’d like to chalk this up to getting older, but after digging into it, I’m pretty sure its just Oracle’s explosion of background processes in 11g that is causing me fits.

Earlier today, I was watching a production database performance tab in Grid Control. My 3-node RAC cluster was experiencing high CPU so I drilled down and noticed that the majority of the CPU cycles were from a background process named NSA2. What?!?!?! What is NSA2? I’m pretty sure this has nothing to do with national security.

In doing a Google search of this background process, I came across a nifty query to give you a quick description of each Oracle background process. You will need to run this as SYS since it queries X$ tables:

column EXTERNAL_NAME format a13
column INTERNAL_NAME format a13
set lin 120
SELECT x$ksbdd.ksbddidn AS external_name,
x$ksmfsv.ksmfsnam AS internal_name, x$ksbdd.ksbdddsc AS description
FROM x$ksbdd, x$ksbdp, x$ksmfsv
WHERE x$ksbdd.indx = x$ksbdp.indx AND x$ksbdp.addr = x$ksmfsv.ksmfsadr
ORDER BY 1
;

In my 11.2.0.2 database, this query returned 296 rows! No wonder I can’t keep up. I remember when I could list all of the background processes off the top of my head (SMON, PMON, ARCH, DBWR, LGWR, CKPT, and RECO). Each new Oracle version increases the number of background processes, which is probably a good thing. But it makes it difficult for the DBA to diagnose problems sometimes.

So what is NSA2? Its a redo transport process responsible for shipping archived redo logs to my Standby database. Once I saw the description from the query above, it was obvious.

So take a read in your database to see all of the wonderful background processes you may meet on your journeys. One of my favorites in the list is TEST.

Nov
14

Earth View from Space

Ok…so this has nothing to do with databases, but it is really cool. I’ve spent many years working with satellite data and have seen lots of different views of the Earth from space. But this video is the coolest one I’ve ever seen.

Nov
11

Installing RAC for a Database with Datafiles

Not that long ago, I needed to upgrade Oracle Clusterware and RDBMS from 11.1.0.7 to 11.2.0.2. I have tons of experience with the Oracle RDBMS software, but I suspect that I am like most DBA’s in that my Clusterware experience is not at the same level as my RDBMS experience. So while I had no fears about upgrading the RDBMS software, I approached the Clusterware upgrade with some degree of nervousness. Admittedly, this uneasiness was born out of a lower degree of experience with the product. So before attempting in production, I tested, tested, and tested as much as I could.
The upgrade of Clusterware went smoothly thanks to lots of planning and testing.

After the upgrade, I was reflecting on what went right and what went wrong, and what I could do differently in the future. It occurred to me that I could have totally screwed up the Clusterware upgrade, blown away the Clusterware software, and still been able to install Clusterware from scratch and been able to get my RAC database up and running. This post will show the steps I took in a test environment to do this task. So if you find yourself with a totally screwed up Clusterware upgrade or install, there is no fear because so long as you still have your database files, you will be able to get things up and running. If your Clusterware upgrade failed, you will have to fix the root cause of the failure before proceeding. So if you had a bad cluster interconnect configuration or invalid storage for your voting disks, you will need to address those issues and then you can follow these steps.

I have a valid database with all of my datafiles in shared storage accessible by all nodes in the cluster. My test has two nodes for the cluster. It is also assumed that the database was shut down cleanly. This paper assumes the database name is ORCL.

Note: If your database you are adding to RAC was already part of RAC, you can skip steps 9-11 below. Steps 9-11 are required if you are also converting the database from a single-instance database to an Oracle RAC database while adding it to Clusterware.

Steps
1. Take a cold backup of your database datafiles. In case something happens, you can always restore from the backup by simply copying the files from the media.
2. Make sure the nodes are cleaned up from the Clusterware installation/upgrade. Refer to Metalink Note 239998.1 for detailed instructions if needed.
3. Install Oracle Clusterware correctly. You should have fixed the problems that caused this step to fail in the first place.
a. Verify the cluster is up and running.
i. $CRS_HOME/bin/crsctl status server
1. All nodes in the cluster should be ONLINE.
4. Install Oracle RDBMS on the cluster.
a. The OUI should detect the Clusterware and offer to perform a RAC installation.
5. Fire up the DBCA to create a shell database. We’ll use the DBCA to set up everything in CRS for us.
a. Choose the options to create a RAC database.
b. Use the same name as your existing database.
c. Create the shell database on all nodes in the cluster.
d. Use a common location for the database files, but do not overwrite your existing datafiles.
6. Shutdown the shell database
a. srvctl stop database –d orcl
7. Set up your SPFILE
a. If you have your old database’s SPFILE:
i. Place the file on shared disk.
ii. Update $ORACLE_HOME/dbs/initorcl.ora to point to the correct SPFILE location. Do this on all nodes
b. If you do not have a SPFILE, create a PFILE from scratch and then use the CREATE SPFILE FROM PFILE command to create a SPFILE.
i. Make sure the CONTROL_FILES parameter points to the correct location, not the shell database’s control files.
c. (Optional) If you have a password file, copy it to shared storage and then update the softlink in $ORACLE_HOME/dbs (on all nodes) to point to the password file. If you do not have a password file, you can create one later if desired.
8. Update CRS with SPFILE location (as root)
a. cd $CRS_HOME/bin
b. ./srvctl config database –d orcl
i. The output should tell you the SPFILE is in the location of the shell database
c. ./srvctl modify database –d orcl –p /correct_path/spfileorcl.ora
i. Do step 8b to verify change has taken place.
ii. Verify the change on the other node as well. You should only need to run this modify command once for all nodes.
9. Modify the SPFILE’s contents.
a. sqlplus /nolog
b. connect / as sysdba
c. create pfile=’/home/oracle/pfile.txt’ from spfile=’/path_to_spfile/spfileorcl.ora’;
d. Open the pfile in a text editor and make the following changes:
i. Add:
1. *.cluster_database=true
2. *.cluster_database_instances=2
3. orcl1.instance_number=1
4. orcl2.instance_number=2
5. orcl1.thread=1
6. orcl2.thread=2
ii. Change the *.undo_tablespace parameter to orcl1.undo_tablespace
e. Use the pfile to change the spfile
i. create spfile=’/path_to_spfile/spfileorcl.ora’ from pfile=’/home/oracle/pfile.txt’;
10. Create a new UNDO tablespace for the other instance.
a. export ORACLE_SID=orcl1
b. sqlplus / as sysdba
c. startup
d. create undo tablespace undotbs2 datafile ‘/path/undotbs02.dbf’ size 30m;
e. alter system set undo_tablespace=undotbs2 scope=spfile sid=’orcl2′;
11. Create redo logs for the new thread
a. alter database add logfile thread 2 group 10 ‘/path/redo10.log’ size 50m;
b. alter database add logfile thread 2 group 11 ‘/path/redo11.log’ size 50m;
c. alter database add logfile thread 2 group 12 ‘/path/redo12.log’ size 50m;
d. alter database enable public thread 2;
12. srvctl start database –d orcl

At this point, you have now installed Clusterware from scratch and added an existing database to it. Now that I have a proven method of taking any set of database files and adding them to any clusterware installation, I have lowered my nervousness levels when performing Clusterware upgrades on production systems. Even if I make a total mess of things, I can blow away Clusterware, reinstall it, and then add the database to Clusterware and have my RAC database up and running in no time.

Oct
12

11gR2 Compression Advisor = Evil

I recently upgraded to 11.2.0.2 from 11.1.0.7. Immediately after the upgrade, I noticed a spike in our redo generation. I also noticed tables with names like DBMS_TABCOMP_TEMP_UNCMP and DBMS_TABCOMP_TEMP_CMP. What are those tables doing in my production schemas and how did they get there? After digging, I was able to find out that these are created by the new 11gR2 Compression Advisor. During the nightly maintenance window, a scheduled job will look at your tables and take a subset of data and load into an uncompressed table (UNCMP). It will then compress your data (CMP) and determine the compression ratio based on your actual data. If the compression ratio is above a certain threshold, the advisor recommends the table for compression. After the Compression Advisor is done, the tables are deleted.

The problem is that the Compression Advisor generates a ton of redo! This has been noted as unpublished bug 8896202. More information can be found on Metalink Note 1284972.1 if you are interested.

I won’t be using the Compression Advisor so I’d like to turn off the nightly job. Unfortunately, the job is part of the Space Advisor and I cannot turn off the Compression Advisor on its own. Well the redo generation rate is simply too high and I guess I can live without the Space Advisor too, so off it goes:

SQL> exec dbms_auto_task_admin.disable(client_name=>'auto space advisor',operation=>null,window_name=>null);

PL/SQL procedure successfully completed.

I see that an enhancement request has been filed to decouple the Compression Advisor from the Space Advisor.

Oct
06

LOB RETENTION

In the past, one would use the PCTVERSION storage parameter for their LOB segments to reserve a percentage of storage space for read consistency of LOB segments. In Oracle 11g, you can now use the RETENTION parameter. The RETENTION parameter will use the UNDO_RETENTION parameter for determining how long to keep LOB data for read-consistency purposes. But be advised that it does not use the Undo tablespace! The LOB segment’s tablespace is used for read-consistency purposes.

When you change the UNDO_RETENTION parameter, the LOB segment’s retention value is not modified. If you query the RETENTION column of the DBA_LOBS view, you will notice the old UNDO_RETENTION value still remains after you have modified the UNDO_RETENTION parameter. To change the LOB segment’s RETENTION value to match the new UNDO_RETENTION value, do the following:

ALTER TABLE my_table MODIFY LOB (lob_column) (PCTVERSION 20);
ALTER TABLE my_table MODIFY LOB (lob_column) (RETENTION);

By momentarily changing the LOB to use PCTVERSION and then back to RETENTION, the new value of UNDO_RETENTION will be used. You need to do this for all LOB segments that you intend to modify.

Older posts «