[Glass] Out of space and now cannot start stone again :(

Mariano Martinez Peck marianopeck at gmail.com
Wed May 21 14:59:29 PDT 2014


On Wed, May 21, 2014 at 6:26 PM, James Foster <
james.foster at gemtalksystems.com> wrote:

> Thanks for the report on how it went. At least you learned something and
> maybe others will have gained a lesson as well.
>
>
Ohhh I remembered something....once you start the stone again with -R it
won't start and log will say something like:

    Opened a transaction log file for read_nolocks.
       filename = /XXX/Sites/XXX/gemstone/data/tranlog24.dbf
    Starting up without recovery and the log file 24 exists
      Remove log files and restart.
    Terminating stone.

So I needed to delete the tranlog file and start again. And there it worked.

Thanks!




> James
>
>
> On May 21, 2014, at 1:05 PM, Mariano Martinez Peck <marianopeck at gmail.com>
> wrote:
>
>
>
>
> On Wed, May 21, 2014 at 3:47 PM, James Foster <
> james.foster at gemtalksystems.com> wrote:
>
>> Hi Mariano,
>>
>> When a transaction log full situation happens the stone will still be
>> running. You should be able to make room (typically by moving existing
>> files elsewhere and then deleting), and then the system will recognize that
>> there is space available and resume operations. Alternatively you can log
>> in as DataCurator and add another transaction log directory (presumably on
>> another disk) and the system will resume operations. See the System
>> Administration Guide (SAG), section 8.4, page 190 for details on these
>> options.
>>
>>
> Hi James, thanks for this great answer!!
>
> Ohh wow....I should have clearly not rebooted the machine right ahead to
> add more hard disk space. Next time: either remove some files/directories
> so that to resume operations and then do a clean shutdown. Then fix the low
> space for real and then restore operations.
>
>
>
>> It is best to try to keep the stone running so that you do not lose data.
>>
>
> OK...didn't know. I learned today.
>
>
>> If you stop the stone and it was not able to shut down cleanly, then it
>> will attempt to replay transaction logs when it next starts. If a
>> transaction log is corrupt (e.g., has an incomplete record), then the
>> replay will fail (as shown in your stone log).
>>
>>
> Indeed, that's exactly what happened in my case.
>
>
>> If you start the stone with the -N option (SAG p. 326), then the database
>> will be consistent as of the last checkpoint, which is generally not the
>> most recent transaction. Checkpoints can be triggered manually (Stone
>> class>>#’startCheckpointAsync’ and Stone class>>#’startCheckpointSync’) and
>> they will happen automatically based on the STN_CHECKPOINT_INTERVAL
>> configuration (SAG p. 289).
>>
>>
> Thanks, I have just checked and yes, it is 5 minutes (300 seconds).
>
>
>> Each checkpoint is recorded in the transaction log and you can get
>> information using the copydbf command (SAG pp. 310-11). In particular, the
>> -I (capital eye) option lists all checkpoint times. Tools are available for
>> doing further analysis of the transaction log (SAG Chapter F), so you can
>> see something about the transactions that happened after the last
>> checkpoint.
>>
>>
> OK, good to know. In my case it was not that critical (I could lost 5
> minutes of edits). But good to know. I have just tried, and yes, I can see
> the checkpoint lists.
>
>
>> It is possible that using copydbf (SAG p. 310) you can copy all but the
>> last (invalid) transaction record. (I took a valid transaction log that had
>> 100899 records, used ‘truncate’ to shorten it, used copydbf, and the result
>> had 100898 records.) To try this rename the existing (bad) log file, and
>> copydbf to the former name and try to restart the stone. If that works then
>> everything except the last transaction is fine.
>>
>>
> OK, I tried. I made the new one and yes, it was some bytes less than the
> "broken" one, but my stone would not launch anyway..saying the same error.
> So I proceed with the -N option. But it was a good shot. Don't know why it
> didn't work.
>
>
>> Otherwise, by starting with the -N option you will have lost transactions
>> since the last checkpoint (typically every five minutes). If you want us to
>> try to recover more then it might be time for a consulting engagement. ;-)
>>
>>
> :) I appreciate your detailed answer. Starting with a -N was enough for
> the moment. Next time: do not reboot nor kill stone. First try to shut it
> down cleanly.
>
> Thanks james!
>
>
>> Let us know how it goes,
>>
>> James Foster
>>
>>
>> On May 21, 2014, at 11:02 AM, Mariano Martinez Peck <
>> marianopeck at gmail.com> wrote:
>>
>> Hi guys,
>>
>> My server got out of disk space and gemstone had the following error
>> while trying to connect via topaz:
>>
>> *Login denied to other than SystemUser or DataCurator. All tranlog*
>> *directories are full. Stone process waiting for operator to archive*
>> *tranlogs or add more directories.,*
>>
>> OK, I make the virtual hard disk , expand partitions etc...now the OS
>> hard disk space looks with 400GB free. But when I try to start the stone
>> again, it fails to do it from the last translog.
>>
>> What I am supposed to do?  startstone -N ?   If true...what info do I
>> lost? what "info" is up to the last checkpoint?  In my case this is a
>> seaside app with GLASS transaction management. Would the lost be just the
>> last request?
>>
>>
>> Thanks in advance,
>>
>>
>>
>>
>> ========================================================================
>>     Now starting GemStone monitor.
>>
>> Write to /proc/2468/oom_score_adj failed with EACCES , linux user does
>> not have CAP_SYS_RESOURCE
>> No server process protection from OOM killer
>>
>>  _____________________________________________________________________________
>> |     SESSION CONFIGURATION: The maximum number of concurrent sessions is
>> 41. |
>>
>> |_____________________________________________________________________________|
>>
>>     Attaching the Shared Cache using Stone name: XXX
>>     Successfully started 1 free frame page servers.
>>
>>     -------------------------------------------------------
>>     Summary of Configured Transaction Logs
>>       Directory   0:
>>         configured name /XXX/Sites/XXX/gemstone/data
>>         expanded name /XXX/Sites/XXX/gemstone/data/
>>         configuredSize 1000 MB
>>       Directory   1:
>>         configured name /XXX/Sites/XXX/gemstone/data
>>         expanded name /XXX/Sites/XXX/gemstone/data/
>>         configuredSize 1000 MB
>>     -------------------------------------------------------
>>
>>     Extent #0
>>     -----------
>>     Filename = !#dbf!/XXX/Sites/XXX/gemstone/data/extent0.dbf
>>     Maximum size = NONE
>>     File size = 10586 Mbytes = 677504 pages
>>     Space available = 7927 Mbytes = 507353 pages
>>
>>     Totals
>>     ------
>>     Repository Size = 10586 Mbytes = 677504 pages
>>     Free Space = 7927 Mbytes = 507353 pages
>>     ---------------------------------------------------
>>     Extent 0 was not cleanly shutdown.
>>
>>
>>     Repository startup statistics:
>>         Pages Need Reclaiming =10
>>         Free Oops=133758598
>>         Oop Number High Water Mark=157752737
>>         Possible Dead Objects=12405000
>>         Dead Objects=0
>>         Epoch Transaction Count=0
>>         Epoch New Objects Union=0
>>         Epoch Written Objects Union=0
>>         Epoch DependencyMap Objects Union=0
>>
>>     Repository startup is from checkpoint = (fileId 13, blockId 388232)
>>
>>    SearchForMostRecentLog did not find any tranlogs
>>
>>  :: (wildcard) found in listening addresses, ignoring other addresses
>> created listening socket for ::, on :: port 43405
>>
>>
>>     Opened a transaction log file for read_nolocks.
>>        filename = /XXX/Sites/XXX/gemstone/data/tranlog13.dbf
>> EOF while reading log record.
>> EOF encountered while reading log record.    Unable to read log record
>> 13.388232 for current checkpoint
>>     Waiting for all tranlog writes to complete
>>
>>     Stone startup has failed.
>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>> _______________________________________________
>> Glass mailing list
>> Glass at lists.gemtalksystems.com
>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>
>>
>>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>
>


-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20140521/f73623d5/attachment.html>


More information about the Glass mailing list