[Glass] Hot standby race between startlogreceiver and SystemRepository continuousRestoreFromArchiveLogs:

Mon Jul 21 05:22:58 PDT 2014

If you restore a production backup on a hot standby machine, then launch
logreceiver, and then run "SystemRepository
continuousRestoreFromArchiveLogs:", the last call can fail, depending on
timing.

Specifically, the logreceiver first needs to fetch the "Oldest tranlog
needed for restore" as reported by copydbf -i. If that one tranlog has not
been downloaded yet, continuousRestoreFromArchiveLogs: fails with "Restore
from transaction log failed, Unable to open log fileId ..."

To make matters worse, logreceiver seems to first need to catch up on a
bunch of tranlogs *before* the first one it really needs, adding extra
wait time.

For all tranlogs *after* that one, continuousRestoreFromArchiveLogs: is
happy to poll/wait until the tranlog arrives before it attempts to restore
it. It does not wait for the first one, though, so we have to add our own
poll/wait loop to our wrapper scripts to wait for the wanted tranlog to
arrive before calling continuousRestoreFromArchiveLogs:.
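
A minimal sketch of such a poll/wait loop, in Python. The function name
wait_for_tranlog, the directory argument, and the "tranlog{}.dbf" file name
pattern are assumptions made for illustration and need to be adjusted to
the actual setup. The file id is taken as a parameter here; it would be
whatever copydbf -i reports as the oldest tranlog needed for restore.

```python
import os
import time


def wait_for_tranlog(archive_dir, file_id, timeout=600.0, interval=5.0,
                     name_pattern="tranlog{}.dbf"):
    """Poll until the wanted tranlog file exists in archive_dir.

    name_pattern is an assumed naming convention, not taken from the
    report. Returns the full path once the file is present and raises
    TimeoutError if it has not shown up within `timeout` seconds.
    """
    path = os.path.join(archive_dir, name_pattern.format(file_id))
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return path
        if time.monotonic() >= deadline:
            raise TimeoutError("tranlog not received in time: " + path)
        time.sleep(interval)
```

A wrapper script would call this with the directory that logreceiver
writes to and only then start continuousRestoreFromArchiveLogs:. The check
only tests that the file exists; whether that is enough, or whether the
script should also wait for the file to be completely transferred, would
have to be verified.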

Alternatively, the hot standby documentation in the System Administration
Guide should at least mention that one needs to wait for the logreceiver
to catch up on its downloading before calling
continuousRestoreFromArchiveLogs:.