Often asked and well documented, but a summary is always good.
What are the conditions for the observer to trigger a fast-start failover, and where should we place the observers?
With 19c, you might have noticed that fewer white papers are being published. Although a PDF is easy to read, it is difficult to maintain. The world is evolving rapidly and we want to provide the best and most accurate best-practice information possible. That is why we decided to put these best practices in the Oracle Documentation as well! Great, isn’t it? The advantage is that we can update it quickly the moment new information is available.
Recently we added information about observer placement too. Let’s summarise it here.
In an ideal state fast-start failover is deployed with the primary, standby, and observer, each within their own availability domain (AD) or data center; however, configurations that only use two availability domains, or even a single availability domain, must be supported. The following are observer placement recommendations for two use cases.
Deployment Configuration 1: 2 regions with two ADs in each region.
- Initial primary region has the primary database in AD1, and two high availability observers in AD2
- Initial standby region has the standby database in AD1, and two high availability observers (used after role change) in AD2
- For the observer, MAA recommends at least 2 observer targets in the same primary region but in different ADs
Deployment Configuration 2: 2 regions with only 1 AD in each region
- Initial primary region has the primary database and two lightweight servers to host observers
- Initial standby region has the standby database and two lightweight servers to host observers (used after a role change)
Observer high availability
You can register up to three observers to monitor a single Data Guard broker configuration. Each observer is identified by a name that you supply when you issue the START OBSERVER command. You can also start the observers as a background process.
Only the primary observer can coordinate fast-start failover with Data Guard broker. All other registered observers are considered to be backup observers.
If the observer was not placed in the background, then the observer is a continuously executing process that is created when the START OBSERVER command is issued. Therefore, the command-line prompt on the observer computer does not return until you issue the STOP OBSERVER command from another DGMGRL session. To issue commands and interact with the broker configuration, you must connect using another DGMGRL client session.
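To make the registration rules above concrete, here is a minimal Python sketch of them. It is only a toy model, not broker code: the class name `ObserverRegistry`, its methods, and the observer names are hypothetical, and the rule that the first registered observer takes the primary role is an assumption made for illustration.

```python
class ObserverRegistry:
    """Toy model of observer registration for one broker configuration."""

    MAX_OBSERVERS = 3  # from the text: up to three observers per configuration

    def __init__(self):
        self._names = []  # registration order is preserved

    def register(self, name):
        """Register an observer by name and return its role."""
        if name in self._names:
            raise ValueError("observer name already in use: " + name)
        if len(self._names) >= self.MAX_OBSERVERS:
            raise ValueError("at most three observers can be registered")
        self._names.append(name)
        return self.role_of(name)

    def role_of(self, name):
        # Assumption for this sketch: the first registered observer is the
        # primary observer; every other registered observer is a backup.
        return "primary" if self._names.index(name) == 0 else "backup"


registry = ObserverRegistry()
for observer_name in ("obs_ad2_a", "obs_ad2_b"):  # hypothetical names
    print(observer_name, "->", registry.register(observer_name))
```

Only the observer reported as primary would coordinate fast-start failover; the others stand by as backups.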
The following conditions can trigger a failover.
- Database failure where all database instances are down
- Data files taken offline because of I/O errors
- Both the Observer and the standby database lose their network connection to the production database, and the standby database confirms that it is in a synchronized state
- A user-configurable condition
This leaves you with the following four possible situations:
Primary + Observer = WIN
If the Primary and Observer can see each other, then the primary stays the primary. In this case, if the standby is unavailable (network issue or standby is down), then the primary goes UNSYNCHRONIZED.
Standby + Observer = WIN
If the Observer and the Standby can see each other and neither can see the primary (and thus the primary cannot see the observer), then the primary will hang in LGWR and a failover will occur. If the primary later regains connectivity with the observer, it shuts down because it is told that it is no longer the primary.
Primary + Standby = WIN
If the primary has connectivity with the standby, then it remains the primary, even if the observer dies.
What if everything fails?
If all three cannot see each other, then the primary will hang in LGWR as it did above.
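The four situations above can be sketched as a small decision function. This is a simplified illustration in Python, not how the broker is implemented: the function name, parameters, and `Outcome` labels are hypothetical, and the sketch assumes the standby was synchronized when connectivity was lost.

```python
from enum import Enum


class Outcome(Enum):
    PRIMARY_CONTINUES = "primary stays primary"
    PRIMARY_UNSYNCHRONIZED = "primary stays primary, goes UNSYNCHRONIZED"
    FAILOVER = "primary hangs in LGWR, failover to the standby"
    PRIMARY_STALLS = "primary hangs in LGWR, no failover"


def fsfo_outcome(primary_sees_observer, primary_sees_standby,
                 standby_sees_observer):
    """Map the three connectivity links to the situations in the text."""
    if primary_sees_standby:
        # Primary + Standby: the primary remains primary, even if the
        # observer is gone.
        return Outcome.PRIMARY_CONTINUES
    if primary_sees_observer:
        # Primary + Observer: the standby is unreachable, so the primary
        # carries on unsynchronized.
        return Outcome.PRIMARY_UNSYNCHRONIZED
    if standby_sees_observer:
        # Standby + Observer: neither can see the primary, so a failover
        # occurs (assuming the standby was synchronized).
        return Outcome.FAILOVER
    # Nobody can see anybody: the primary stalls and nothing fails over.
    return Outcome.PRIMARY_STALLS


print(fsfo_outcome(primary_sees_observer=False,
                   primary_sees_standby=False,
                   standby_sees_observer=True).value)
```

The order of the checks mirrors the text: the primary keeps its role as long as it can reach either the standby or the observer, and a failover needs the standby and observer to agree that the primary is gone.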
Fast-start failover is a great solution for database availability. It is easy to implement, and the observer only has the footprint of an Oracle Admin Client installation. It will NOT trigger a failover when data can be lost in Maximum Availability or Maximum Protection mode. This also makes it very safe to use.
As always, questions, remarks?
Find me on twitter @vanpupi