[Glass] Out of space and now cannot start stone again :(
Mariano Martinez Peck
marianopeck at gmail.com
Wed May 21 13:05:33 PDT 2014
On Wed, May 21, 2014 at 3:47 PM, James Foster <
james.foster at gemtalksystems.com> wrote:
> Hi Mariano,
>
> When a transaction log full situation happens the stone will still be
> running. You should be able to make room (typically by moving existing
> files elsewhere and then deleting), and then the system will recognize that
> there is space available and resume operations. Alternatively you can log
> in as DataCurator and add another transaction log directory (presumably on
> another disk) and the system will resume operations. See the System
> Administration Guide (SAG), section 8.4, page 190 for details on these
> options.
>
>
Hi James, thanks for this great answer!!
Oh wow... I clearly should not have rebooted the machine right away to
add more disk space. Next time: first remove some files/directories so
that the stone resumes operations, then do a clean shutdown, then fix the
low disk space for real, and only then restart the stone.
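
A minimal sketch of a check that could warn before the tranlog directories
fill up, so there is time to archive old logs while the stone is still
running. It only uses Python's standard library. The directory list and the
2000 MB threshold are assumptions for illustration, not values from the SAG.

```python
import shutil


def low_space_dirs(dirs, min_free_mb):
    """Return (directory, free_mb) pairs with less than min_free_mb free."""
    low = []
    for d in dirs:
        # disk_usage reports on the filesystem that contains d
        free_mb = shutil.disk_usage(d).free // (1024 * 1024)
        if free_mb < min_free_mb:
            low.append((d, free_mb))
    return low


if __name__ == "__main__":
    # Hypothetical: replace "." with the real tranlog directories.
    for d, free_mb in low_space_dirs(["."], min_free_mb=2000):
        print(f"WARNING: {d} has only {free_mb} MB free")
```

Run from cron or a monitoring agent, something like this would give a chance
to move old tranlogs away before the stone stops accepting logins.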
> It is best to try to keep the stone running so that you do not lose data.
>
OK, I didn't know that. I learned something today.
> If you stop the stone and it was not able to shut down cleanly, then it
> will attempt to replay transaction logs when it next starts. If a
> transaction log is corrupt (e.g., has an incomplete record), then the
> replay will fail (as shown in your stone log).
>
>
Indeed, that's exactly what happened in my case.
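
A small sketch for finding which tranlog the replay needs, based on the
"Repository startup is from checkpoint" line in the stone log quoted at the
end of this message. The function names are made up for illustration, and the
fileId-to-filename mapping is an assumption inferred from that log (fileId 13
and tranlog13.dbf), not something taken from the SAG.

```python
import re

# Matches the line as it appears in the stone log quoted in this thread.
CHECKPOINT_RE = re.compile(
    r"Repository startup is from checkpoint = \(fileId (\d+), blockId (\d+)\)"
)


def startup_checkpoint(log_text):
    """Return (file_id, block_id) from a stone log, or None if absent."""
    match = CHECKPOINT_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def tranlog_name(file_id):
    # Assumption: naming inferred from this thread (fileId 13 -> tranlog13.dbf).
    return f"tranlog{file_id}.dbf"
```

For the log in this thread, `startup_checkpoint` would give fileId 13 and
blockId 388232, which matches the tranlog13.dbf file that hits EOF.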
> If you start the stone with the -N option (SAG p. 326), then the database
> will be consistent as of the last checkpoint, which is generally not the
> most recent transaction. Checkpoints can be triggered manually (System
> class>>#startCheckpointAsync and System class>>#startCheckpointSync) and
> they will happen automatically based on the STN_CHECKPOINT_INTERVAL
> configuration (SAG p. 289).
>
>
Thanks, I have just checked and yes, it is 5 minutes (300 seconds).
> Each checkpoint is recorded in the transaction log and you can get
> information using the copydbf command (SAG pp. 310-11). In particular, the
> -I (capital eye) option lists all checkpoint times. Tools are available for
> doing further analysis of the transaction log (SAG Chapter F), so you can
> see something about the transactions that happened after the last
> checkpoint.
>
>
OK, good to know. In my case it was not that critical (I could afford to
lose 5 minutes of edits). I have just tried it, and yes, I can see the list
of checkpoints.
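
A hedged sketch of wrapping the checkpoint listing in a script. The -I option
is the one James mentions above. Passing the tranlog file as the only other
argument is an assumption to verify against the SAG, and the function names
are made up for illustration.

```python
import shutil
import subprocess


def checkpoint_listing_command(tranlog_path):
    # -I (capital i) comes from this thread; the argument layout is assumed.
    return ["copydbf", "-I", tranlog_path]


def list_checkpoints(tranlog_path):
    """Run copydbf -I and return whatever it prints."""
    if shutil.which("copydbf") is None:
        raise RuntimeError("copydbf not found on PATH; is GemStone set up?")
    result = subprocess.run(
        checkpoint_listing_command(tranlog_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors together with the listing
        text=True,
        check=True,
    )
    return result.stdout
```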
> It is possible that using copydbf (SAG p. 310) you can copy all but the
> last (invalid) transaction record. (I took a valid transaction log that had
> 100899 records, used ‘truncate’ to shorten it, used copydbf, and the result
> had 100898 records.) To try this rename the existing (bad) log file, and
> copydbf to the former name and try to restart the stone. If that works then
> everything except the last transaction is fine.
>
>
OK, I tried. I made the new file and yes, it was a few bytes smaller than the
"broken" one, but my stone still would not start, reporting the same error.
So I proceeded with the -N option. It was a good shot, though. I don't know
why it didn't work.
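
For the record, a sketch of the rename-and-copy attempt James describes (it
did not help in my case). The backup location is a hypothetical choice, and
the copydbf argument order (source, then destination) is an assumption to
check against SAG p. 310 before running anything like this.

```python
import os
import shutil
import subprocess


def repair_plan(tranlog_path, backup_path):
    """Return the copydbf command that rewrites the log under its old name."""
    # Assumption: copydbf takes the source file first, then the destination.
    return ["copydbf", backup_path, tranlog_path]


def apply_repair(tranlog_path, backup_path):
    if os.path.exists(backup_path):
        raise FileExistsError(backup_path)  # never overwrite an earlier backup
    shutil.move(tranlog_path, backup_path)  # keep the original (bad) log
    subprocess.run(repair_plan(tranlog_path, backup_path), check=True)
```

The original file is only moved, never deleted, so if the stone still fails
to start it can be put back before falling back to startstone -N.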
> Otherwise, by starting with the -N option you will have lost transactions
> since the last checkpoint (typically every five minutes). If you want us to
> try to recover more then it might be time for a consulting engagement. ;-)
>
>
:) I appreciate your detailed answer. Starting with -N was enough for the
moment. Next time: do not reboot or kill the stone. First try to shut it
down cleanly.
Thanks, James!
> Let us know how it goes,
>
> James Foster
>
>
> On May 21, 2014, at 11:02 AM, Mariano Martinez Peck <marianopeck at gmail.com>
> wrote:
>
> Hi guys,
>
> My server ran out of disk space and GemStone reported the following error
> when I tried to connect via topaz:
>
> *Login denied to other than SystemUser or DataCurator. All tranlog*
> *directories are full. Stone process waiting for operator to archive*
> *tranlogs or add more directories.,*
>
> OK, I enlarged the virtual hard disk, expanded the partitions, etc., and now
> the OS shows 400GB free. But when I try to start the stone again, it
> fails to start from the last tranlog.
>
> What am I supposed to do? startstone -N? If so, what info do I
> lose? What "info" is included up to the last checkpoint? In my case this is
> a Seaside app with GLASS transaction management. Would the loss be just the
> last request?
>
>
> Thanks in advance,
>
>
>
>
> ========================================================================
> Now starting GemStone monitor.
>
> Write to /proc/2468/oom_score_adj failed with EACCES , linux user does not
> have CAP_SYS_RESOURCE
> No server process protection from OOM killer
>
>  _____________________________________________________________________________
> | SESSION CONFIGURATION: The maximum number of concurrent sessions is 41.     |
> |_____________________________________________________________________________|
>
> Attaching the Shared Cache using Stone name: XXX
> Successfully started 1 free frame page servers.
>
> -------------------------------------------------------
> Summary of Configured Transaction Logs
> Directory 0:
> configured name /XXX/Sites/XXX/gemstone/data
> expanded name /XXX/Sites/XXX/gemstone/data/
> configuredSize 1000 MB
> Directory 1:
> configured name /XXX/Sites/XXX/gemstone/data
> expanded name /XXX/Sites/XXX/gemstone/data/
> configuredSize 1000 MB
> -------------------------------------------------------
>
> Extent #0
> -----------
> Filename = !#dbf!/XXX/Sites/XXX/gemstone/data/extent0.dbf
> Maximum size = NONE
> File size = 10586 Mbytes = 677504 pages
> Space available = 7927 Mbytes = 507353 pages
>
> Totals
> ------
> Repository Size = 10586 Mbytes = 677504 pages
> Free Space = 7927 Mbytes = 507353 pages
> ---------------------------------------------------
> Extent 0 was not cleanly shutdown.
>
>
> Repository startup statistics:
> Pages Need Reclaiming =10
> Free Oops=133758598
> Oop Number High Water Mark=157752737
> Possible Dead Objects=12405000
> Dead Objects=0
> Epoch Transaction Count=0
> Epoch New Objects Union=0
> Epoch Written Objects Union=0
> Epoch DependencyMap Objects Union=0
>
> Repository startup is from checkpoint = (fileId 13, blockId 388232)
>
> SearchForMostRecentLog did not find any tranlogs
>
> :: (wildcard) found in listening addresses, ignoring other addresses
> created listening socket for ::, on :: port 43405
>
>
> Opened a transaction log file for read_nolocks.
> filename = /XXX/Sites/XXX/gemstone/data/tranlog13.dbf
> EOF while reading log record.
> EOF encountered while reading log record. Unable to read log record
> 13.388232 for current checkpoint
> Waiting for all tranlog writes to complete
>
> Stone startup has failed.
>
>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
--
Mariano
http://marianopeck.wordpress.com