Starting Up Your System

Follow these ordering guidelines when you start your GemFire system.

Start server-distributed systems before you start their client applications. In each distributed system, follow these guidelines for member startup:
  • Start locators first. See Running GemFire Locator Processes for examples of locator startup commands.
  • Start cache servers before the rest of your processes unless the implementation requires that other processes be started ahead of them. See Running GemFire Server Processes for examples of server startup commands.
  • If your distributed system uses both persistent replicated and non-persistent replicated regions, start all the persistent replicated members in parallel before starting the members that host the non-persistent regions. This way, persistent members will not delay their startup waiting for other persistent members that hold more recent data.
  • If you are running producer processes and consumer or event listener processes, start the producers first. This ensures the consumers and listeners receive all notifications and updates.
  • If you are starting your locators and peer members all at once, you can set the locator-wait-time property (in seconds) at process startup. This timeout allows peers to wait for the locators to finish starting before attempting to join the distributed system. If a process has been configured to wait for a locator to start, it logs an info-level message:
    GemFire startup was unable to contact a locator. Waiting for one to start. Configured locators are frodo[12345],pippin[12345].
    The process then sleeps for a second and retries until it either connects or the number of seconds specified in locator-wait-time has elapsed. By default, locator-wait-time is set to zero, meaning that a process that cannot connect to a locator at startup throws an exception.
Note: You can optionally override the default timeout period for shutting down individual processes. This override setting must be specified during member startup. See Option for System Member Shutdown Behavior for details.
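
The locator-wait-time retry behavior described above can be pictured with a small, self-contained Java sketch. This is not GemFire source code: the class LocatorWaitSketch, the method joinWithWait, and the tryConnect callback are hypothetical names used only for illustration, and the loop counts retries rather than measuring wall-clock time. Only the property name locator-wait-time and its default of zero come from the text above.

```java
import java.util.Properties;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the locator-wait-time retry loop; not GemFire code. */
class LocatorWaitSketch {

    /**
     * Tries to reach a locator, pausing between attempts until the configured
     * number of seconds has been used up.
     *
     * @param props       properties that may contain "locator-wait-time" (seconds)
     * @param tryConnect  stand-in for the real join attempt (an assumption)
     * @param sleepMillis pause between attempts; 1000 models the one-second sleep
     * @return the number of attempts made before connecting
     */
    static int joinWithWait(Properties props, BooleanSupplier tryConnect, long sleepMillis) {
        // A missing property behaves like the documented default of zero.
        int waitSeconds = Integer.parseInt(props.getProperty("locator-wait-time", "0"));
        int attempts = 0;
        for (int elapsed = 0; ; elapsed++) {
            attempts++;
            if (tryConnect.getAsBoolean()) {
                return attempts;
            }
            if (elapsed >= waitSeconds) {
                // With the default of zero, the first failure ends the startup.
                throw new IllegalStateException(
                    "Unable to contact a locator within " + waitSeconds + " seconds");
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for a locator", e);
            }
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("locator-wait-time", "5");
        int[] calls = {0};
        // Simulated locator that becomes reachable on the third attempt.
        int attempts = joinWithWait(props, () -> ++calls[0] >= 3, 10);
        System.out.println("Joined after " + attempts + " attempts");
    }
}
```

In a real deployment you would not write this loop yourself; you would only set locator-wait-time for the member and let the product perform the waiting.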

Starting Up After Losing Data on Disk

This information pertains to catastrophic loss of GemFire disk store files. If you lose disk store files, your next startup may hang, waiting for the lost disk stores to come back online. If your system hangs at startup, use the gfsh command show missing-disk-stores to list the missing disk stores and, if needed, use revoke missing-disk-store to revoke them so that system startup can complete. You must use the Disk Store ID to revoke a disk store. The two commands look like this:
gfsh>show missing-disk-stores

           Disk Store ID             |   Host    |               Directory                                           
------------------------------------ | --------- | -------------------------------------
60399215-532b-406f-b81f-9b5bd8d1b55a | excalibur | /usr/local/gemfire/deploy/disk_store1 

gfsh>revoke missing-disk-store --id=60399215-532b-406f-b81f-9b5bd8d1b55a
Note: These gfsh commands require that you be connected to the distributed system through a JMX Manager node.
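
When several disk stores are missing, it can help to generate the revoke commands from the listing and review them before running any. The following Java sketch does that under stated assumptions: the class RevokeCommandSketch and the method revokeCommands are hypothetical helpers, not part of GemFire, and the parser assumes the pipe-separated layout shown in the sample output above, with the Disk Store ID in the first column. Real gfsh output may be laid out differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that turns a missing-disk-store listing into revoke commands. */
class RevokeCommandSketch {

    // Assumes each data row starts with a UUID-style Disk Store ID followed by '|'.
    private static final Pattern ID_AT_LINE_START = Pattern.compile(
        "^\\s*([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})\\s*\\|");

    /** Returns one "revoke missing-disk-store --id=..." line per ID found in the listing. */
    static List<String> revokeCommands(String tableOutput) {
        List<String> commands = new ArrayList<>();
        for (String line : tableOutput.split("\\R")) {
            Matcher m = ID_AT_LINE_START.matcher(line);
            // Header and separator rows contain no UUID, so they are skipped.
            if (m.find()) {
                commands.add("revoke missing-disk-store --id=" + m.group(1));
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        String output = String.join("\n",
            "           Disk Store ID             |   Host    |               Directory",
            "------------------------------------ | --------- | -------------------------------------",
            "60399215-532b-406f-b81f-9b5bd8d1b55a | excalibur | /usr/local/gemfire/deploy/disk_store1");
        // Print the commands for review; this sketch never executes them.
        revokeCommands(output).forEach(System.out::println);
    }
}
```

The sketch only prints commands. Check each ID against the hosts and directories you know to be lost before issuing any of them in gfsh.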