Solaris Troubleshooting : NFS TroubleShooting
This article walks through some basic troubleshooting steps for NFS problems.
1. Determine the NFS version:
To determine what version and transport of NFS is currently available, run rpcinfo on the NFS server.
# rpcinfo | grep 100003
100003 2 udp 0.0.0.0.8.1 nfs superuser
100003 3 udp 0.0.0.0.8.1 nfs superuser
100003 2 tcp 0.0.0.0.8.1 nfs superuser
100003 3 tcp 0.0.0.0.8.1 nfs superuser
The second column above is the NFS version; the third column is the transport protocol.
Sun has implemented the following versions of NFS on its operating systems, for both client and server:
| OS Version | NFSv2 | NFSv3 | NFSv4 |
| SunOS | UDP | | |
| Solaris[TM] 2.4 and below | UDP | | |
| Solaris[TM] 2.5, 2.6, 7, 8, 9 | UDP and/or TCP | UDP and/or TCP | |
| Solaris[TM] 10 | UDP and/or TCP | UDP and/or TCP | TCP* |
*The UDP transport is not supported in NFSv4, as it does not contain the required congestion control methods.
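On a busy server the rpcinfo listing can be long, so it may help to reduce it to just the NFS versions and transports. The following is a minimal sketch, not part of the original procedure: the function name nfs_versions is made up for illustration, and it assumes the listing has the program number, version and transport in the first three columns, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: summarise NFS versions and transports.
# Reads an rpcinfo listing on stdin and keeps only program 100003 (nfs).
# Assumes columns 1-3 are program number, version and transport.
nfs_versions() {
    awk '$1 == 100003 { print "NFSv" $2, $3 }' | LC_ALL=C sort -u
}

# Example use on the NFS server:
#   rpcinfo | nfs_versions
```

Each output line pairs one version with one transport, which can be compared directly against the table above.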
2. Check the Connectivity for NFS Server from NFS client:
1. Check that the NFS server is reachable from the client (replace server with the host name of the NFS server):
# /usr/sbin/ping server
2. If the server is not reachable from the client, make sure that the local name service is running. For NIS+ clients:
# /usr/lib/nis/nisping -u
3. If the name service is running, make sure that the client has received the correct host information:
# /usr/bin/getent hosts server
4. If the host information is correct, but the server is not reachable from the client, run the ping command from another client.
5. If the server is reachable from the second client, use ping to check connectivity of the first client to other systems on the local network. If this fails, check the networking configuration on the client. Check the following files:
/etc/hosts, /etc/netmasks, /etc/nsswitch.conf,
/etc/nodename, /etc/net/*/hosts etc.
6. If the software is correct, check the networking hardware.
Additionally, you can refer to the article “NFS Hard mounts vs Soft Mounts”.
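The first few checks in this section can be strung together in a small script. This is only a sketch under stated assumptions: the function name check_nfs_server is invented here, and it relies on the Solaris behaviour of ping, which sends a probe, prints "is alive" and exits with status 0 when the host answers. On systems where ping runs until interrupted, a count or timeout option would have to be added.

```shell
#!/bin/sh
# Hypothetical helper: report the first connectivity check that fails.
# $1 is the host name the client uses for the NFS server.
# Assumes a Solaris-style ping that exits on its own with status 0 on success.
check_nfs_server() {
    server=$1
    if ping "$server" >/dev/null 2>&1; then
        echo "reachable: $server"
        return 0
    fi
    # No answer: find out whether the name resolves at all.
    if ! getent hosts "$server" >/dev/null 2>&1; then
        echo "unresolved: $server"
        return 2
    fi
    echo "unreachable: $server"
    return 1
}

# Example use on the client:
#   check_nfs_server server
```

An "unresolved" result points at the name service and the files listed in step 5; an "unreachable" result points at the network path or the hardware.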
3. From the Server, Verify Service Daemons are running
a) Confirm that the Solaris 10 SMF NFS server services are online:
# svcs -a | grep nfs
b) The statd, lockd, mountd and nfsd processes should be running:
# ps -elf | grep nfs
c) Compare the times when nfsd and mountd started with the time when rpcbind was started. rpcbind MUST have started before the NFS daemons.
d) Verify that the NFS programs have been registered with rpcbind:
# rpcinfo -s
To confirm a specific RPC service over TCP, use the following commands (server is the host name of the NFS server; 100003 is nfs, 100005 is mountd and 100021 is nlockmgr):
# rpcinfo -t server 100003
# rpcinfo -t server 100005
# rpcinfo -t server 100021
e) Logging may be enabled (not for NFSv4).
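Step b) can be turned into a quick check that names any daemon missing from the process listing. This is a rough sketch: the function name missing_nfs_daemons is made up, and it assumes the daemons live under /usr/lib/nfs/, which is where they appear in the ps output quoted in the comments below.

```shell
#!/bin/sh
# Hypothetical helper: print each daemon named in "$@" that does not
# appear in the ps listing supplied on stdin.
# Assumes the daemons run from /usr/lib/nfs/ (as in the ps output below).
missing_nfs_daemons() {
    listing=$(cat)
    for d in "$@"; do
        case "$listing" in
            *"/usr/lib/nfs/$d"*) ;;      # found, nothing to report
            *) echo "$d" ;;              # not in the listing
        esac
    done
}

# Example use on the server:
#   ps -ef | missing_nfs_daemons statd lockd mountd nfsd
```

No output means all of the named daemons were found. On a client, only statd and lockd need to be passed.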
On the client:
a) Confirm that the Solaris 10 SMF NFS client services are online:
# svcs -a | grep nfs
b) statd and lockd should be running:
# ps -elf | grep nfs
c) Verify from the client side that the server is working (server is the host name of the NFS server):
# rpcinfo -s server | egrep 'nfs|mountd|lock'
# rpcinfo -u server 100003
# rpcinfo -u server 100005
# rpcinfo -u server 100021
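When the svcs listing is long, it can be easier to print only the NFS services that are not online. The sketch below assumes the usual three-column svcs layout (STATE, STIME, FMRI); the function name nfs_services_not_online is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: list NFS services whose state is anything but "online".
# Reads svcs output on stdin; assumes columns are STATE, STIME, FMRI.
nfs_services_not_online() {
    awk '$3 ~ /^svc:\/network\/nfs\// && $1 != "online" { print $1, $3 }'
}

# Example use:
#   svcs -a | nfs_services_not_online
```

A disabled nfs/server on a machine that is only an NFS client is expected and is not a fault by itself.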
4. Confirm proper syntax of dfstab share entries on NFS server.
Solaris OS defines shared (or exported) filesystems in the /etc/dfs/dfstab file. The standard syntax of lines in that file is:
share [-F fstype] [-o options] [-d "<text>"] <pathname> [resource]
For example, the following /etc/dfs/dfstab file is for a server that makes available the filesystems /usr, /var/spool/mail and /home:
share -F nfs /usr
share -F nfs /var/spool/mail
share -F nfs /home
You can add share options to these lines, such as ro, rw and root. This is done by preceding the options with the -o flag. The following example shows our /etc/dfs/dfstab file with all filesystems shared read-only:
share -F nfs -o ro /usr
share -F nfs -o ro /var/spool/mail
share -F nfs -o ro /home
To add new shares to existing ones, simply run the shareall command:
# shareall
This will share ALL filesystems listed in the /etc/dfs/dfstab file. If you have never shared filesystems from this machine before, you must run the nfs.server script:
# /etc/init.d/nfs.server start
This will run the shareall(1M) command and start the nfs daemons, mountd(1M), and nfsd. The “nfs.server start” procedure is also run on bootup, when the system enters run level 3.
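Before running shareall, it can be useful to list what the dfstab file would actually share and with which options. The following is a simple sketch, not a full parser: the function name list_dfstab_shares is hypothetical, and it only handles the plain form used in the examples above, where the pathname is the last word on the line and there is no -d description or resource name.

```shell
#!/bin/sh
# Hypothetical helper: print "pathname options" for each share line.
# Reads dfstab content on stdin.
# Assumes the pathname is the last field (no -d text, no resource name).
list_dfstab_shares() {
    awk '
        /^[ \t]*#/ || $1 != "share" { next }   # skip comments and other lines
        {
            opts = "(defaults)"
            for (i = 2; i < NF; i++)
                if ($i == "-o") opts = $(i + 1)
            print $NF, opts
        }'
}

# Example use on the server:
#   list_dfstab_shares < /etc/dfs/dfstab
```

A filesystem that shows ro here will refuse writes from every client regardless of the permissions on the directory.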
5. Confirm file system is shared as seen on both ends.
The NFS server is the system that shares a file system. The “showmount -e” or “dfshares” command displays what is being shared. From the client, run the same command with the NFS server name as an argument.
# showmount -e
Note that NFSv4 does not use mountd. If mountd is not running, showmount will not work.
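To test for one particular filesystem rather than reading the whole export list, the showmount output can be filtered. This is a minimal sketch assuming the export list prints the shared path as the first word of each line, as in the showmount output quoted in the comments below; the function name is_exported is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if path $1 appears in the export list on stdin.
# Assumes the shared path is the first field of each export line.
is_exported() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Example use from the client (server is the NFS server host name):
#   showmount -e server | is_exported /jumpstart && echo "shared"
```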
6. Verify mount point exists and is in use
To display statistics for each NFS-mounted file system, use the command “nfsstat -m”. This command also shows which options were used when the file system was mounted. You can also check the contents of /etc/mnttab, which lists what is currently mounted. Lastly, compare the dates on the server and the client. An incorrect date may make a file appear to have been created in the future, which causes confusion.
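The NFS entries can be picked out of /etc/mnttab with a one-line filter. The sketch below assumes the standard mnttab field order (resource, mount point, file system type, options, mount time); the function name list_nfs_mounts is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show mount point, source and options of NFS mounts.
# Reads mnttab-format lines on stdin.
# Assumes fields: resource, mount point, fstype, options, time.
list_nfs_mounts() {
    awk '$3 == "nfs" { print $2 " <- " $1 " (" $4 ")" }'
}

# Example use on the client:
#   list_nfs_mounts < /etc/mnttab
```

If the expected mount point is missing from the output, the mount either failed or was made on a different directory.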







11 Comments on “Solaris Troubleshooting : NFS TroubleShooting”
Hi,
I am trying to set up NFS on a server, but on the client side, when I check the NFS services, the client service goes to the “offline*” state.
ROOT@SAPTEST4# svcs | grep -i nfs —————– (CLIENT SIDE O/P)
online Apr_15 svc:/network/nfs/status:default
online Apr_15 svc:/network/nfs/cbd:default
online Apr_15 svc:/network/nfs/mapid:default
online Apr_15 svc:/network/nfs/nlockmgr:default
online Apr_15 svc:/network/nfs/rquota:default
disable Apr_20 svc:/network/nfs/server:default
offline* Apr_15 svc:/network/nfs/client:default
What does offline* mean?
I restarted the svc:/network/nfs/server:default service, but within 2 minutes it went back to the disabled state.
When I mount the file system on the client, the mount completes successfully, but when I try to create any file or directory I get a permission denied error.
Please help me.
@Chetan, the client service is started with “svcadm enable nfs/client”, not by restarting the nfs/server service.
From your comment, I assume that the client service is running fine on the client machine, which is why you are able to mount the file system.
About the permission denied error: please check the permissions on the shared directory at the source, change them to 777 and try to write a file. If that allows you to write, then the problem is with the permissions. If it still does not allow you to write, then the problem is with the sharing options, which means you have to share the directory with proper rw access for the client. Please check section 4 of this post.
Hi sir,
These are the steps I took on the NFS server end:
===============================================================================================
######### SERVER SIDE #############
===============================================================================================
bash-3.2# share -F nfs /jumpstart
bash-3.2# shareall
bash-3.2# exportfs -va
shareall -F nfs
bash-3.2# showmount -e
export list for APPLE:
/jumpstart (everyone)
bash-3.2# ps -ef | grep -i nfs
daemon 1502 1 0 00:05:07 ? 0:00 /usr/lib/nfs/lockd
daemon 1486 1 0 00:05:06 ? 0:00 /usr/lib/nfs/statd
daemon 1483 1 0 00:05:06 ? 0:00 /usr/lib/nfs/nfs4cbd
root 1643 1 0 00:05:14 ? 0:00 /usr/lib/nfs/mountd
daemon 1484 1 0 00:05:06 ? 0:00 /usr/lib/nfs/nfsmapid
daemon 1649 1 0 00:05:14 ? 0:00 /usr/lib/nfs/nfsd
bash-3.2# svcs | grep -i nfs
online 0:05:05 svc:/network/nfs/mapid:default
online 0:05:06 svc:/network/nfs/cbd:default
online 0:05:06 svc:/network/nfs/status:default
online 0:05:07 svc:/network/nfs/nlockmgr:default
online 0:05:12 svc:/network/nfs/client:default
online 0:05:12 svc:/network/nfs/rquota:default
online 0:05:14 svc:/network/nfs/server:default
bash-3.2# dfshares
RESOURCE SERVER ACCESS TRANSPORT
APPLE:/jumpstart APPLE - -
bash-3.2# svcs | grep -i RPC
online 0:05:05 svc:/network/rpc/bind:default
online 0:05:11 svc:/network/rpc/gss:default
online 0:05:12 svc:/network/rpc/cde-calendar-manager:default
online 0:05:12 svc:/network/rpc/cde-ttdbserver:tcp
online 0:05:12 svc:/network/rpc/smserver:default
online 0:05:12 svc:/network/rpc-100235_1/rpc_ticotsord:default
bash-3.2# ping 192.168.254.11
192.168.254.11 is alive
===============================================================================================
######### CLIENT SIDE #############
===============================================================================================
bash-3.2# ps -ef | grep -i nfs
daemon 1457 1 0 00:08:32 ? 0:00 /usr/lib/nfs/statd
daemon 1481 1 0 00:08:33 ? 0:00 /usr/lib/nfs/lockd
daemon 1462 1 0 00:08:33 ? 0:00 /usr/lib/nfs/nfs4cbd
daemon 1463 1 0 00:08:33 ? 0:00 /usr/lib/nfs/nfsmapid
bash-3.2# svcs | grep -i nfs
online 0:08:32 svc:/network/nfs/status:default
online 0:08:32 svc:/network/nfs/cbd:default
online 0:08:33 svc:/network/nfs/mapid:default
online 0:08:33 svc:/network/nfs/nlockmgr:default
online 0:08:43 svc:/network/nfs/rquota:default
bash-3.2# svcs /network/nfs/server
STATE STIME FMRI
disabled 0:08:47 svc:/network/nfs/server:default
bash-3.2# svcs /network/nfs/client
STATE STIME FMRI
disabled 0:07:41 svc:/network/nfs/client:default
bash-3.2# svcadm enable /network/nfs/server
bash-3.2# svcs /network/nfs/server
STATE STIME FMRI
disabled 0:11:39 svc:/network/nfs/server:default
bash-3.2# svcadm enable svc:/network/nfs/client:default
bash-3.2# svcs /network/nfs/client
STATE STIME FMRI
offline* 0:13:49 svc:/network/nfs/client:default
bash-3.2# mount -F nfs 192.168.254.21:/jumpstart /mnt
bash-3.2# df -kh /mnt
Filesystem size used avail capacity Mounted on
192.168.254.21:/jumpstart
5.9G 2.2G 3.6G 39% /mnt
bash-3.2# ls -ld /mnt
drwxr-xr-x+ 7 root root 512 Apr 29 23:28 /mnt
bash-3.2# cd /mnt
bash-3.2# ls -la
total 62
drwxr-xr-x+ 7 root root 512 Apr 29 23:28 .
drwxr-xr-x 40 root root 1024 May 4 12:31 ..
-r-xr-xr-x 1 root root 17375 Jan 14 2005 analyze_patches
drwxr-xr-x+ 5 root root 512 Apr 29 20:52 boot
dr-xr-xr-x+ 3 root root 512 Apr 29 23:24 config
dr-xr-xr-x+ 2 root root 512 Mar 25 2008 database
drwx------+ 2 root root 8192 Apr 29 20:50 lost+found
drwxr-xr-x+ 6 root root 512 Apr 29 21:04 os
bash-3.2# mkdir test
mkdir: Failed to make directory "test"; Permission denied
bash-3.2# dfshares
nfs dfshares:MANGO: RPC: Program not registered
bash-3.2# dfshares
nfs dfshares:MANGO: RPC: Program not registered
bash-3.2# dfmounts
nfs dfmounts: can't contact server: MANGO: RPC: Program not registered
bash-3.2# svcs | grep -i RPC
online 0:08:31 svc:/network/rpc/bind:default
online 0:08:43 svc:/network/rpc/gss:default
online 0:08:43 svc:/network/rpc/cde-calendar-manager:default
online 0:08:43 svc:/network/rpc/cde-ttdbserver:tcp
online 0:08:43 svc:/network/rpc/smserver:default
online 0:08:43 svc:/network/rpc-100235_1/rpc_ticotsord:default
bash-3.2# ping 192.168.254.21
192.168.254.21 is alive
bash-3.2# showmount -e MANGO
showmount: MANGO: RPC: Unknown host
bash-3.2# rpcinfo -s
program version(s) netid(s) service owner
100000 2,3,4 udp,tcp,ticlts,ticotsord,ticots rpcbind superuser
1073741824 1 tcp - 1
100024 1 ticots,ticotsord,ticlts,tcp,udp status superuser
100133 1 ticots,ticotsord,ticlts,tcp,udp - superuser
100021 4,3,2,1 tcp,udp nlockmgr 1
100234 1 ticotsord gssd superuser
100424 1 ticotsord - superuser
100068 5,4,3,2 ticlts - superuser
100083 1 ticotsord - superuser
100155 1 ticotsord smserverd superuser
100134 1 ticotsord ktkt_warnd superuser
100011 1 udp,ticlts rquotad superuser
100235 1 ticotsord - superuser
100099 4 ticotsord - superuser
100231 1 ticots,ticotsord,ticlts - superuser
100005 3,2,1 ticots,ticotsord,tcp,ticlts,udp mountd superuser
100003 4,3,2 tcp,udp nfs 1
100227 3,2 tcp,udp nfs_acl 1
100169 1 ticots,ticotsord,ticlts - superuser
=============================================******==================================================
Sir, you said the “client service running fine on the client machine”, but it is in the offline* state, not online. One more thing: on the client side, the /network/nfs/server service stays disabled even after enabling or restarting it.
Despite all this I am able to mount the file system, but when I check “ls -ld /mnt”, why does it show “drwxr-xr-x+” in the output? I think this “+” sign means an ACL is in place.
On the server side I also put the share command in the /etc/dfs/dfstab file, with the same output.
Thanks
Chetan
@chetan –
On client side:
>> You do not need this to run. The server service is required only on the server side, and it will come online only if you have an entry in /etc/dfs/dfstab.
bash-3.2# svcadm enable /network/nfs/server
bash-3.2# svcs /network/nfs/server
STATE STIME FMRI
disabled 0:11:39 svc:/network/nfs/server:default
>> you just need the below
bash-3.2# svcadm enable svc:/network/nfs/client:default
bash-3.2# svcs /network/nfs/client
STATE STIME FMRI
offline* 0:13:49 svc:/network/nfs/client:default
The asterisk in offline* means the service is still in transition and has not come online, so something is holding up its startup. Run the command “svcs -xv” and you will find the log file path for the nfs/client service. That log will tell you why it is not starting.
>> about the + symbol below
bash-3.2# ls -ld /mnt
drwxr-xr-x+ 7 root root 512 Apr 29 23:28 /mnt
Yes, you have ACLs in place which are restricting you from writing there. Run “ls -ldv /jumpstart” on the server and see what permissions are actually set. You can also run “getfacl /jumpstart”.
You can remove the ACL (setfacl -d …… command) if you are on a test machine, and then try to configure NFS again.
>> Finally, I am a little surprised that you are able to mount the directory on the client when the nfs/client service is offline.
Can you run the command “showmount -e” without the host name MANGO? If that shows the server shares, then rpcbind is able to reach the server successfully. It may still fail after you restart the box, so I would like to see the output after you restart your client machine.
Sir, as you said, I found the log file from the “svcs -xv” command, killed that process and ran “/lib/svc/method/nfs-client start” again, and it worked. I also started the mountd daemon on the client side and finally rebooted the client.
After that the offline* issue was resolved; now the client and RPC services are online on both sides.
bash-3.2# svcs -a | grep -i nfs
disabled 23:57:11 svc:/network/nfs/server:default
online 23:58:13 svc:/network/nfs/cbd:default
online 23:58:13 svc:/network/nfs/mapid:default
online 23:58:14 svc:/network/nfs/status:default
online 23:58:16 svc:/network/nfs/nlockmgr:default
online 23:58:21 svc:/network/nfs/rquota:default
online 0:03:15 svc:/network/nfs/client:default
=========================================================================
As you asked, here is the “ls -ldv /jumpstart” output:
bash-3.2# ls -ldv /jumpstart
drwxr-xr-x 8 root root 512 May 5 00:27 /jumpstart
0:user::rwx
1:group::r-x #effective:r-x
2:mask:r-x
3:other:r-x
bash-3.2# showmount -e ——————————— at client side
no exported file systems for MANGO
=========================================================================
The ACL issue is still pending; if I create any file or directory, it shows an error:
bash-3.2# mount -F nfs 192.168.254.21:/jumpstart /mnt
bash-3.2#
bash-3.2#
bash-3.2# showmount -e
no exported file systems for MANGO
bash-3.2# cd /mnt/
bash-3.2#
bash-3.2#
bash-3.2# ls -ldv /mnt
drwxr-xr-x+ 8 root root 512 May 5 00:27 /mnt
0:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
/append_data/execute/delete_child/read_attributes/write_attributes
/read_acl/write_acl/synchronize:allow
1:owner@::deny
2:group@:add_file/write_data/add_subdirectory/append_data/delete_child
/write_attributes/write_acl:deny
3:group@:list_directory/read_data/execute/read_attributes/read_acl
/synchronize:allow
4:group@:add_file/write_data/add_subdirectory/append_data/delete_child
/write_attributes/write_acl:deny
5:everyone@:list_directory/read_data/execute/read_attributes/read_acl
/synchronize:allow
6:everyone@:add_file/write_data/add_subdirectory/append_data
/delete_child/write_attributes/write_acl:deny
bash-3.2# uptime
12:33am up 37 min(s), 2 users, load average: 0.01, 0.01, 0.20
bash-3.2# getfacl /jumpstart
/jumpstart: No such file or directory
bash-3.2# getfacl /mnt
# file: /mnt
# owner: root
# group: root
user::rwx
group::r-x #effective:r-x
mask:r-x
other:r-x
bash-3.2# cd /mnt
bash-3.2#
bash-3.2#
bash-3.2# ls -la
total 64
drwxr-xr-x+ 8 root root 512 May 5 00:27 .
drwxr-xr-x 40 root root 1024 May 7 23:56 ..
-r-xr-xr-x 1 root root 17375 Jan 14 2005 analyze_patches
drwxr-xr-x+ 5 root root 512 Apr 29 20:52 boot
dr-xr-xr-x+ 3 root root 512 Apr 29 23:24 config
dr-xr-xr-x+ 2 root root 512 Mar 25 2008 database
drwx------+ 2 root root 8192 Apr 29 20:50 lost+found
drwxr-xr-x+ 6 root root 512 Apr 29 21:04 os
drwxrwxrwx+ 2 root root 512 May 5 00:27 test1
bash-3.2# mkdie test1
bash: mkdie: command not found
bash-3.2# mkdiee test1
bash: mkdiee: command not found
bash-3.2# mkdier test1
bash: mkdier: command not found
bash-3.2# mkdir tet
mkdir: Failed to make directory "tet"; Permission denied
bash-3.2# touch tet
touch: cannot create tet: Permission denied
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
One more request: please write a blog post on ACLs and RBAC (from basics to troubleshooting), because this kind of issue comes up daily for sysadmins, so it would be really helpful.
Many thanks for replying,
Chetan
Glad to know that the client service problem is resolved. I will post about ACLs sometime soon.
Ramdev@
Many thanks for your help. I think there is a patch issue as well: we logged a case with Oracle, and they say there is a bug in the current patch, which is why the ACL issue appeared on the client side. I think the same issue exists with the Solaris 10 U10 patch.
Anyway, I am waiting for the ACL and RBAC (user management) blog post.
Again, many thanks.
Chetan
We have users in more than 30 groups, and those memberships need to pass through NFS v3. We have tried all possibilities, but the NFS services crash frequently on Solaris 10.
We need help overcoming the default NFS group limit (16 groups supported by default).
Hi Arvind, you have hit a well-known limitation of NFS with AUTH_SYS (the authentication method used to authenticate client connections). AUTH_SYS cannot handle users who belong to more than 16 groups. Setting the kernel parameter NGROUPS_UMAX=32 will not help in this case.
As a workaround, I have noticed that Oracle officially recommends using ACLs for user access instead of groups.
If you have the option of using Linux as the NFS server, another workaround is to run rpc.mountd with the “-g” option (refer to the man page for more info).
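To see which accounts are affected, you can count the groups each user belongs to. This is only a rough sketch: the function name exceeds_auth_sys_groups is made up, and it assumes you feed it a whitespace-separated list of group IDs, for example from the POSIX form “id -G username”.

```shell
#!/bin/sh
# Hypothetical helper: succeed if stdin lists more than 16 group IDs,
# the most that AUTH_SYS can carry for one user.
# Assumes whitespace-separated group IDs, e.g. from "id -G username".
exceeds_auth_sys_groups() {
    n=$(wc -w | tr -d ' ')
    [ "$n" -gt 16 ]
}

# Example use (username is a placeholder):
#   id -G username | exceeds_auth_sys_groups && echo "over the limit"
```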
Hi Ramdev,
Thanks for the reply. I received the same input from Sun support, and according to their update it will be resolved in Solaris 11 (delivered in s11u1_04). I do not know whether this will solve it or not.
Hi Ram,
I would like to give user-level access to one of the shared filesystems.
Suppose server1 is my master, server2 is the client, /test is the filesystem shared from server1 to server2, and /data is the mount point on the client. I want only the root user and the DBA to access the filesystem, and to restrict it for all others.
Could you please help me with the syntax here?
Laxxi