LDOM troubleshooting: Unable to start ldmd (Ldom Daemon).
This week I was doing some testing on LDoms so that I could share some useful information about them with our GurkulIndia followers. During the initial phases I ran into an interesting problem, and I would like to share the troubleshooting with you all.
Below are the system patch level, package levels and firmware levels:
yogesh#uname -a
SunOS yogesh 5.10 Generic_147440-01 sun4v sparc SUNW,Sun-Fire-T200
yogesh#pkginfo -l | grep -i ldm
PKGINST: SUNWldm
PKGINST: SUNWldmib
yogesh#pkginfo -l | grep -i SUNWldm
PKGINST: SUNWldm
PKGINST: SUNWldmib
yogesh#pkginfo -l SUNWldm
PKGINST: SUNWldm
NAME: Logical Domains Manager
CATEGORY: application
ARCH: sparc.sun4v
VERSION: 1.0.3,REV=2008.04.22.23.09
BASEDIR: /
VENDOR: Sun Microsystems, Inc.
PSTAMP: svlpen-on10-020090202174032
INSTDATE: Apr 11 2012 05:34
STATUS: completely installed
FILES: 47 installed pathnames
6 shared pathnames
14 directories
13 executables
25385 blocks used (approx)
yogesh#cd /opt/SUNWldm/bin
yogesh#ls
ldm ldmd ldmd_start schemas
yogesh#prtconf -V
OBP 4.26.1 2007/04/02 16:26
sc> showhost
Sun-Fire-T2000 System Firmware 6.4.6 2007/06/24 18:43
Host flash versions:
Hypervisor 1.4.1 2007/04/02 16:37
OBP 4.26.1 2007/04/02 16:26
POST 4.26.0 2007/03/26 16:45
sc> showsc version -v
Advanced Lights Out Manager CMT v1.4.2
SC Firmware version: CMT 1.4.2
SC Bootmon version: CMT 1.4.2
VBSC 1.4.3
VBSC firmware built Jun 24 2007, 17:18:36
SC Bootmon Build Release: 01
SC bootmon checksum: 5C67AFFC
SC Bootmon built Jun 24 2007, 17:34:19
SC Build Release: 01
SC firmware checksum: F0272F89
SC firmware built Jun 24 2007, 17:34:33
SC firmware flashupdate TUE JUL 15 14:06:52 2008
SC System Memory Size: 32 MB
SC NVRAM Version = 14
SC hardware type: 4
FPGA Version: 4.2.4.7
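If you need to record these firmware levels for several servers, the showhost output can be picked apart with a few lines of Python. This is only a rough sketch: the function name parse_showhost is made up for illustration, and it assumes the output has the same layout as the T2000 output shown above (one component per line, followed by a build date).

```python
import re

# Component lines are assumed to look like "Hypervisor 1.4.1 2007/04/02 16:37".
COMPONENT_RE = re.compile(
    r"^(Hypervisor|OBP|POST)\s+(\S+)\s+\d{4}/\d{2}/\d{2}", re.MULTILINE
)
# The first line carries the overall System Firmware release.
SYSFW_RE = re.compile(r"System Firmware\s+(\S+)")


def parse_showhost(text):
    """Return a dict of component name -> version string (hypothetical helper)."""
    versions = {name: version for name, version in COMPONENT_RE.findall(text)}
    match = SYSFW_RE.search(text)
    if match:
        versions["System Firmware"] = match.group(1)
    return versions


# Sample text copied from the showhost output in this post.
sample = """Sun-Fire-T2000 System Firmware 6.4.6 2007/06/24 18:43
Host flash versions:
Hypervisor 1.4.1 2007/04/02 16:37
OBP 4.26.1 2007/04/02 16:26
POST 4.26.0 2007/03/26 16:45
"""

for name, version in sorted(parse_showhost(sample).items()):
    print(name, version)
```

Keeping the result as a plain dictionary makes it easy to compare the values before and after a firmware change.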
When I tried to start the main LDoms daemon, it would not start, even though everything looked fine from the OS perspective. I found nothing on the Sun site for troubleshooting this error; the only option offered there was to migrate to LDoms 1.2 or later. Below is the error I got when starting ldmd:
yogesh#/opt/SUNWldm/bin/ldmd_start
Added to channel ds, service fma-mem-service, versions 1.0
Added to channel ds, service dr-cpu, versions 1.0
Added to channel fmactl, service fma-phys-mem-service, versions 1.0
Added to channel ds, service md-update, versions 1.0
Added to channel ds, service domain-shutdown, versions 1.0
Added to channel ds, service domain-panic, versions 1.0
Added to channel ds, service fma-cpu-service, versions 1.0
Added to channel fmactl, service fma-pri-service, versions 1.0
Added to channel fmactl, service fma-phys-cpu-service, versions 1.0
Added to channel spds, service pri, versions 1.0
Added to channel spds, service mdstore, versions 1.0
Added to channel ds, service var-config, versions 1.0
fatal error: HV MD major version mismatch: found version 0,
can only support version 1.
yogesh#
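The important part of this output is the last message, which reports the version that ldmd found and the version it can support. As a small sketch, the two numbers can be pulled out of captured output like this. The pattern is based only on the message shown above, so other LDoms releases may word it differently, and parse_hv_md_mismatch is just an illustrative name.

```python
import re

# Matches the two-line message exactly as it appears in the output above.
# \s* lets the pattern span the line break after the comma.
MISMATCH_RE = re.compile(
    r"HV MD major version mismatch: found version (\d+),\s*"
    r"can only support version (\d+)"
)


def parse_hv_md_mismatch(output):
    """Return (found, supported) if the mismatch error is present, else None."""
    match = MISMATCH_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Tail of the ldmd_start output quoted in this post.
sample = (
    "Added to channel ds, service var-config, versions 1.0\n"
    "fatal error: HV MD major version mismatch: found version 0,\n"
    "can only support version 1.\n"
)

result = parse_hv_md_mismatch(sample)
if result is not None:
    found, supported = result
    print(f"found version {found}, supported version {supported}")
```

A difference between the two numbers is the hint that the problem sits below the OS, which is why reinstalling the package alone does not help.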
I found one suggested workaround for this issue. It did not work for me, but here it is in case someone else would like to try it:
From SC/ALOM:
SC> bootmode config="factory-default"
SC> poweroff
SC> resetsc -y
SC> poweron
SC> bootmode config="factory-default"
SC> poweroff
Are you sure you want to power off the system [y/n]? y
SC>
SC Alert: SC Request to Power Off Host.
SC Alert: Host system has shut down.
SC>
sc> resetsc -y
User Requested SC Shutdown
ALOM BOOTMON v1.4.2
ALOM Build Release: 001
Reset register: 00000000
ALOM POST 1.0
Dual Port Memory Test, PASSED.
TTY External - Internal Loopback Test
TTY External - Internal Loopback Test, PASSED.
TTYC - Internal Loopback Test
TTYC - Internal Loopback Test, PASSED.
TTYD - Internal Loopback Test
TTYD - Internal Loopback Test, PASSED.
Memory Data Lines Test
Memory Data Lines Test, PASSED.
Memory Address Lines Test
Slide address bits to test open address lines
Test for shorted address lines
Memory Address Lines Test, PASSED.
Boot Sector FLASH CRC Test
Boot Sector FLASH CRC Test, PASSED.
Return to Boot Monitor for Handshake
ALOM POST 1.0
Status = 00007fff
Returned from Boot Monitor and Handshake
Loading the runtime image... VxWorks running.
Starting Advanced Lights Out Manager CMT v1.4.2
Copyright 2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Current mode: NORMAL
Attaching network interface lo0... done.
Attaching network interface motfec0.... done.
Booting from Segment 0
Sun(tm) Advanced Lights Out Manager CMT v1.4.2
Full VxDiag Tests
BASIC TOD TEST
Read the TOD Clock: WED APR 11 11:22:41 2012
Wait, 1 - 3 seconds
SC Alert: SC System booted.
Read the TOD Clock: WED APR 11 11:22:43 2012
BASIC TOD TEST, PASSED
ETHERNET CPU LOOPBACK TEST
50 BYTE PACKET - a 0 in field of 1's.
50 BYTE PACKET - a 1 in field of 0's.
900 BYTE PACKET - pseudo-random data.
ETHERNET CPU LOOPBACK TEST, PASSED
Full VxDiag Tests - PASSED
Status summary - Status = 7FFF
VxDiag - - PASSED
POST - - PASSED
LOOPBACK - - PASSED
I2C - - PASSED
EPROM - - PASSED
FRU PROM - - PASSED
ETHERNET - - PASSED
MAIN CRC - - PASSED
BOOT CRC - - PASSED
TTYD - - PASSED
TTYC - - PASSED
MEMORY - - PASSED
MPC885 - - PASSED
Please login: admin
Please Enter password: *****
SC> poweron
{0} ok boot
Boot device: rootdisk File and args:
SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: yogesh
LDAP NIS domain name is labrh
Loading smf(5) service descriptions: 1/1
NOTICE: core_log: ldmd[283] core dumped: /var/core/core_yogesh_ldmd_0_0_1334138715_283
Apr 11 11:05:16 svc.startd[10]: svc:/ldoms/ldmd:default: Method "/opt/SUNWldm/bin/ldmd_start" failed with exit status 1.
NOTICE: core_log: ldmd[353] core dumped: /var/core/core_yogesh_ldmd_0_0_1334138717_353
Apr 11 11:05:18 svc.startd[10]: svc:/ldoms/ldmd:default: Method "/opt/SUNWldm/bin/ldmd_start" failed with exit status 1.
Unauthorized access to or use of this system is prohibited.
All access and use may be monitored and recorded.
yogesh console login: Apr 11 11:48:20 yogesh genunix: NOTICE: core_log: ldmd[429] core dumped: /var/core/core_yogesh_ldmd_0_0_1334138718_429
Apr 11 11:48:20 yogesh svc.startd[10]: ldoms/ldmd:default failed: transitioned to maintenance
(see 'svcs -xv' for details)
At this point I tried removing the package and installing it again, but still no luck. Then I went through the logs very carefully and focused on the "HV MD major version mismatch" message, which pointed me towards the hypervisor firmware version. I went ahead with a firmware upgrade on the server, which worked and resolved the issue. The patch I applied (the latest firmware patch for the T2000) is 139434-09. This patch updates the versions of the components below:
Sun System Firmware is composed of the following five individually
versioned components:
– Advanced Lights Out Manager CMT (ALOM-CMT)
– vBSC
– Hypervisor
– OpenBoot (OBP)
– Power On Self Test (POST)
yogesh#/opt/SUNWldm/bin/ldmd_start
Added to channel ds, service fma-mem-service, versions 1.0
Added to channel ds, service dr-cpu, versions 1.0
Added to channel fmactl, service fma-phys-mem-service, versions 1.0
Added to channel ds, service md-update, versions 1.0
Added to channel ds, service domain-shutdown, versions 1.0
Added to channel ds, service domain-panic, versions 1.0
Added to channel ds, service fma-cpu-service, versions 1.0
Added to channel fmactl, service fma-pri-service, versions 1.0
Added to channel fmactl, service fma-phys-cpu-service, versions 1.0
Added to channel spds, service pri, versions 1.0
Added to channel spds, service mdstore, versions 1.0
Added to channel ds, service var-config, versions 1.0
fatal error: HV MD major version mismatch: found version 0,
can only support version 1.
yogesh#pwd
/var/tmp/139434-09
yogesh#ls -lrt
total 43741
-r--r--r-- 1 root root 18775 Sep 17 2010 LEGAL_LICENSE.TXT
drwxr-xr-x 2 root root 5 Jul 28 2011 Legal
-rwxr-xr-x 1 root root 8196 Jul 28 2011 sysfwdownload.README
-rwxr-xr-x 1 root root 72 Jul 28 2011 copyright
-rwxr-xr-x 1 root root 11065402 Jul 28 2011 Sun_System_Firmware-6_7_12-Sun_Fire_T2000.bin
-rwxr-xr-x 1 root root 11065402 Jul 28 2011 Sun_System_Firmware-6_7_12-SPARC_Enterprise_T2000.bin
-rwxr-xr-x 1 root root 11334 Aug 2 2011 Install.info
-rwxr-xr-x 1 root root 21308 Aug 11 2011 sysfwdownload
-rwxr-xr-x 1 root root 2066 Aug 17 2011 Sun_Fire_T2000_metadata.xml
-rw-r--r-- 1 root root 13824 Aug 29 2011 README.139434-09
-rw-r--r-- 1 root root 16098 Aug 29 2011 139434-09.html
yogesh#pwd
/var/tmp/139434-09
yogesh#./sysfwdownload /var/tmp/139434-09/Sun_System_Firmware-6_7_12-Sun_Fire_T2000.bin
.......... (9%).......... (18%).......... (27%).......... (37%).......... (46%).......... (55%).......... (64%).......... (74%).......... (83%).......... (92%)......... (100%)
Download completed successfully.
Caution: Make sure that your virtual keyswitch setting is not in the LOCKED position.
You can check the setting from the System Controller CLI with the following command:
SC> showkeyswitch
If the virtual key switch is in LOCKED position you can change that with the following command:
SC> setkeyswitch -y normal
sc> flashupdate -s 127.0.0.1
SC Alert: System poweron is disabled.
………………………………………………………………………………………………………………………………………………………………..
Update complete. Reset device to use new software.
SC Alert: SC firmware was reloaded
sc> resetsc -y
User Requested SC Shutdown
ALOM BOOTMON v1.7.11
ALOM Build Release: 001
Reset register: 00000000
ALOM POST 1.0
Dual Port Memory Test, PASSED.
TTY External - Internal Loopback Test
TTY External - Internal Loopback Test, PASSED.
TTYC - Internal Loopback Test
TTYC - Internal Loopback Test, PASSED.
TTYD - Internal Loopback Test
TTYD - Internal Loopback Test, PASSED.
Memory Data Lines Test
Memory Data Lines Test, PASSED.
Memory Address Lines Test
Slide address bits to test open address lines
Test for shorted address lines
Memory Address Lines Test, PASSED.
Boot Sector FLASH CRC Test
Boot Sector FLASH CRC Test, PASSED.
Return to Boot Monitor for Handshake
ALOM POST 1.0
Status = 00007fff
Returned from Boot Monitor and Handshake
Loading the runtime image... VxWorks running.
Starting Advanced Lights Out Manager CMT v1.7.11
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
Current mode: NORMAL
Attaching network interface lo0... done.
Attaching network interface motfec0.... done.
Booting from Segment 1
Oracle Advanced Lights Out Manager CMT v1.7.11
SC Alert: SC System booted.
Full VxDiag Tests
BASIC TOD TEST
Read the TOD Clock: WED APR 11 11:45:23 2012
Wait, 1 - 3 seconds
Read the TOD Clock: WED APR 11 11:45:25 2012
BASIC TOD TEST, PASSED
ETHERNET CPU LOOPBACK TEST
50 BYTE PACKET - a 0 in field of 1's.
50 BYTE PACKET - a 1 in field of 0's.
900 BYTE PACKET - pseudo-random data.
ETHERNET CPU LOOPBACK TEST, PASSED
Full VxDiag Tests - PASSED
Status summary - Status = 7FFF
VxDiag - - PASSED
POST - - PASSED
LOOPBACK - - PASSED
I2C - - PASSED
EPROM - - PASSED
FRU PROM - - PASSED
ETHERNET - - PASSED
MAIN CRC - - PASSED
BOOT CRC - - PASSED
TTYD - - PASSED
TTYC - - PASSED
MEMORY - - PASSED
MPC885 - - PASSED
Please login: admin
Please Enter password: *****
sc>
sc> poweron -c
{0} ok boot
Boot device: rootdisk File and args:
SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: yogesh
LDAP NIS domain name is labrh
Unauthorized access to or use of this system is prohibited.
All access and use may be monitored and recorded.
yogesh console login:
yogesh#showrev -p | grep -i 139398
Patch: 139398-02 Obsoletes: Requires: Incompatibles: Packages: SUNWldm
yogesh#svcs -a | grep -i ldm
online Apr_11 svc:/ldoms/ldmd:default
yogesh#ps -ef | grep -i ldm
root 316 1 0 Apr 11 ? 123:40 /opt/SUNWldm/bin/ldmd
root 17103 16741 0 12:23:45 pts/1 0:00 grep -i ldm
yogesh#ldm list
——————————————————————————
Notice: the LDom Manager is running in configuration mode. Configuration and
resource information is displayed for the configuration under construction;
not the current active configuration. The configuration being constructed
will only take effect after it is downloaded to the system controller and
the host is reset.
——————————————————————————
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-c- SP 32 32640M 0.3% 2d 18m
yogesh# exit
yogesh console login: #.
sc> showhost
Sun-Fire-T2000 System Firmware 6.7.12 2011/07/06 20:03
Host flash versions:
OBP 4.30.4.d 2011/07/06 14:29
Hypervisor 1.7.3.c 2010/07/09 15:14
POST 4.30.4.b 2010/07/09 14:24
sc> showsc version -v
Advanced Lights Out Manager CMT v1.7.11
SC Firmware version: CMT 1.7.11
SC Bootmon version: CMT 1.7.11
VBSC 1.7.3.d
VBSC firmware built Jul 6 2011, 19:27:17
SC Bootmon Build Release: 01
SC bootmon checksum: 4CB78FC8
SC Bootmon built Jul 6 2011, 19:37:05
SC Build Release: 01
SC firmware checksum: C41F3325
SC firmware built Jul 6 2011, 19:37:18
SC firmware flashupdate WED APR 11 11:43:02 2012
SC System Memory Size: 32 MB
SC NVRAM Version = 14
SC hardware type: 4
FPGA Version: 4.2.4.7
Everything worked fine after the firmware upgrade on the server.
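To confirm that every component really moved forward, the version strings from the two showhost runs in this post can be compared. The sketch below is only an illustration: version_key is a made-up helper, and treating a trailing letter such as the "c" in 1.7.3.c as newer than any number in the same position is my own assumption, not something documented by Sun or Oracle.

```python
def version_key(version):
    """Turn a dotted version like '1.7.3.c' into a tuple that sorts sensibly.

    Numeric pieces become (0, number) and anything else becomes (1, text),
    so numbers and letters are never compared directly with each other.
    """
    parts = []
    for piece in version.split("."):
        if piece.isdigit():
            parts.append((0, int(piece)))
        else:
            parts.append((1, piece))
    return tuple(parts)


# Versions copied from the showhost output before and after patch 139434-09.
before = {"System Firmware": "6.4.6", "Hypervisor": "1.4.1",
          "OBP": "4.26.1", "POST": "4.26.0"}
after = {"System Firmware": "6.7.12", "Hypervisor": "1.7.3.c",
         "OBP": "4.30.4.d", "POST": "4.30.4.b"}

for name in before:
    old, new = before[name], after[name]
    status = "newer" if version_key(new) > version_key(old) else "not newer"
    print(f"{name}: {old} -> {new} ({status})")
```

Comparing tuples piece by piece avoids the usual trap of plain string comparison, where "6.7.12" would sort before "6.7.9".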




5 Comments on “LDOM troubleshooting: Unable to start ldmd (Ldom Deamon).”
Good One, Yogesh.
Hi Ram & yogesh,
Thnsk 4 the blog!!!!!!!
Thanks 4 the LDOM troubleshooting; but can u give some basic knowledge of the LDOM How work & all things
PLZ this will really workfull for me
Sure chetan. Yogesh is seriously looking into that part. You will see more information from him soon.
HI,
thanks so much for reply,
I waiting for LDOM!!!!!!
@Ram, thanks for your boosters all the time. @Chetan, I am working on Ldom stuff for our gurkulindia followers and will post LDOM basic very soon.