Tuesday, April 26, 2011

RAID 01 Vs RAID10

Difference between RAID 0+1 vs RAID 1+0?


We have covered RAID levels and their I/O characteristics in earlier posts. While building a DR (Disaster Recovery) environment for one of our clients, the client asked: “How is RAID 1+0 different from RAID 0+1?”

Both RAID 0+1 and RAID 1+0 are nested RAID levels: the disks are divided into sets, one RAID level is applied within each set to form arrays, and a second RAID level is then applied on top of those arrays to form the nested array. Following the nomenclature for RAID 1 (mirroring) and RAID 0 (striping), RAID 1+0 is also called a stripe of mirrors and RAID 0+1 a mirror of stripes. Let’s follow this up with an example.

Suppose that we have 20 disks with which to build either a RAID 1+0 or a RAID 0+1 array.
a) If we choose RAID 1+0 (RAID 1 first and then RAID 0), we divide those 20 disks into 10 sets of two. We turn each set into a RAID 1 array and then stripe across the 10 mirrored sets.
b) If, on the other hand, we choose RAID 0+1 (i.e. RAID 0 first and then RAID 1), we divide the 20 disks into 2 sets of 10. We turn each set into a RAID 0 array of 10 disks and then mirror those two arrays.
So, is there a difference at all? The usable storage is the same, the drive requirements are the same and, based on testing, there is not much difference in performance either. The real difference is in fault tolerance. Let’s look at the two layouts in more detail:
RAID 1+0:
Drives 1+2     = RAID 1 (Mirror Set A)
Drives 3+4     = RAID 1 (Mirror Set B)
Drives 5+6     = RAID 1 (Mirror Set C)
Drives 7+8     = RAID 1 (Mirror Set D)
Drives 9+10     = RAID 1 (Mirror Set E)
Drives 11+12     = RAID 1 (Mirror Set F)
Drives 13+14     = RAID 1 (Mirror Set G)
Drives 15+16     = RAID 1 (Mirror Set H)
Drives 17+18     = RAID 1 (Mirror Set I)
Drives 19+20     = RAID 1 (Mirror Set J)
Now we do a RAID 0 stripe across sets A through J. If drive 5 fails, only mirror set C is affected. It still has drive 6, so it continues to function and the entire RAID 1+0 array keeps working. Now suppose that, while drive 5 is being replaced, drive 17 fails. The array is still fine, because drive 17 is in a different mirror set. The bottom line is that in the above configuration at most 10 drives can fail without affecting the array, as long as they are all in different mirror sets.
Now, let’s look at what happens in RAID 0+1:
RAID 0+1:
Drives 1+2+3+4+5+6+7+8+9+10        = RAID 0 (Stripe Set A)
Drives 11+12+13+14+15+16+17+18+19+20    = RAID 0 (Stripe Set B)
Now these two stripe sets are mirrored. If one drive fails, say drive 5, the entire stripe set A fails. The RAID 0+1 array is still fine, since we have stripe set B. If drive 17 then also goes down, the whole array is down. One can argue that this is not always the case and that it depends on the type of controller. Suppose you had a smart controller that kept striping to the other 9 drives in stripe set A after drive 5 failed; if drive 17 failed later, it could use drive 7, which holds the same data. If the controller can do that then, theoretically speaking, RAID 0+1 would be as fault tolerant as RAID 1+0. Most controllers do not do that, though.
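The two failure scenarios above can be checked with a small shell sketch. The function names raid10_state and raid01_state are made up for this illustration. The drive numbering (1 to 20) and the set layout follow the example, and the RAID 0+1 function assumes the common controller behaviour described above, where one failed drive takes its whole stripe set offline.

```bash
#!/usr/bin/env bash
# Sketch only: report whether the 20-drive example array is "up" or "down"
# for a given list of failed drive numbers (1..20).

# RAID 1+0: mirror sets are (1,2), (3,4), ... (19,20).
# The array is lost only when both drives of the same mirror set have failed.
raid10_state() {
    local failed=() d s
    for d in "$@"; do failed[d]=1; done
    for (( s = 1; s <= 10; s++ )); do
        if [[ -n ${failed[2*s-1]:-} && -n ${failed[2*s]:-} ]]; then
            echo down
            return
        fi
    done
    echo up
}

# RAID 0+1: stripe set A is drives 1..10, stripe set B is drives 11..20.
# Assumption: a single failed drive takes its whole stripe set offline,
# so the array is lost as soon as both stripe sets contain a failed drive.
raid01_state() {
    local d a_failed=0 b_failed=0
    for d in "$@"; do
        if (( d <= 10 )); then a_failed=1; else b_failed=1; fi
    done
    if (( a_failed && b_failed )); then echo down; else echo up; fi
}

raid10_state 5 17    # prints "up"
raid01_state 5 17    # prints "down"
```

The same pair of failures (drives 5 and 17) leaves the stripe of mirrors running and takes the mirror of stripes down, which is the whole difference between the two layouts.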


!Enjoy

Kuldeep Sharma

What RAID is Best for You?


Most of you are familiar with the basic RAID technologies available today, but it is always better to have too much information about this topic than not enough. Here is a brief yet informative summary of the most popular hardware RAID configurations, including pros and cons for each:
RAID-0 (Striped)
  • Does not provide fault tolerance
  • Minimum number of disks required = 2
  • Usable storage capacity = 100%
  • This is the fastest of the RAID configurations from a read-write standpoint
  • Is the least expensive RAID solution because there is no duplicate data
  • Recommended use for temporary data only
RAID-1 (Mirrored)
  • Fault tolerant – you can lose multiple disks as long as a mirrored pair is not lost
  • Minimum number of disks required = 2
  • Usable storage capacity = 50%
  • Good read performance, relatively slow write performance
  • Recommended for operating system log files
RAID-5 (Striped with Parity)
  • Fault tolerant – can afford to lose one disk only
  • Minimum number of disks required = 3
  • Usable storage capacity = subtract 1 whole disk from the total number in the array (e.g. three 60 GB hard drives would provide 120 GB of usable disk space)
  • Generally good performance, and increases with concurrency – the more drives in the array the faster the performance
  • Recommended for operating system files, shared data, and application files
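RAID-0+1 (Striped with Mirrors)
  • Fault tolerant – you can lose multiple disks as long as both are not part of a mirrored pair (strictly speaking this is the behaviour of a stripe of mirrors, i.e. RAID 1+0; with a true mirror of stripes, most controllers lose the array once each stripe set has a failed disk)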
  • Minimum number of disks required = 4
  • Usable storage capacity = 50%
  • Generally good performance, and increases with concurrency – the more drives in the array the faster the performance
  • Recommended for operating systems, shared data, application files, and log files
Additional Things to Keep in Mind
  • If you are using more than two disks, RAID 0+1 is a better solution than RAID 1
  • Usable storage capacity increases as the number of disks increases, but so does the cost of the configuration
  • Performance increases as you add disks, but again, so does cost
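The usable-capacity figures in the summary above can be worked out with a few lines of shell arithmetic. This is a sketch only: the function name usable_gb is made up for the illustration, all disks are assumed to be the same size, and the RAID 1 case simply applies the 50% figure quoted above.

```bash
#!/usr/bin/env bash
# usable_gb LEVEL DISKS SIZE_GB
# Prints the usable capacity in GB, using the percentages from the summary above.
usable_gb() {
    local level=$1 disks=$2 size=$3
    case $level in
        0)      echo $(( disks * size )) ;;          # striping only: 100% usable
        1|0+1)  echo $(( disks * size / 2 )) ;;      # mirrored: 50% usable
        5)      echo $(( (disks - 1) * size )) ;;    # one disk's worth goes to parity
        *)      echo "unsupported RAID level: $level" >&2
                return 1 ;;
    esac
}

usable_gb 5 3 60      # three 60 GB disks in RAID 5: prints 120
usable_gb 0+1 4 60    # four 60 GB disks in RAID 0+1: prints 120
```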


Thursday, April 21, 2011

Useful Linux Tips


Task                     File / Command
Startup script           /etc/rc.d/rc
Kernel                   /boot/vmlinuz
Kernel Parameters        sysctl -a
Reconfigure the kernel   cd /usr/src/linux
                         make mrproper
                         make menuconfig
                         make dep
                         make clean
                         make bzImage
                         make install
                         make modules
                         make modules_install
                         cp arch/i386/boot/bzImage /boot/vmlinuz-2.2.16
                         mkinitrd /boot/initrd-2.2.16.img 2.2.16
                         vi /etc/lilo.conf
                         lilo
List modules             lsmod
Load module              insmod
Unload module            rmmod
Initialize system        netconf
Physical RAM             free -m
Kernel Bits              getconf LONG_BIT
Crash utility            lcrash
Trace System Calls       strace
Machine model            uname -m
OS Level                 uname -r
Run Level                runlevel
Hardware Information     dmidecode
Timezone Management      /etc/sysconfig/clock
NTP Daemon               /etc/ntp.conf
                         /etc/rc.d/init.d/xntpd


Monday, April 18, 2011

Linux Admin Q&A

Interview Questions And Answers

Q: - How are devices represented in UNIX?

All devices are represented by files called special files that are located in /dev directory.

Q: - What is 'inode'?

Every UNIX file has its description stored in a structure called an 'inode'. The inode contains information about the file: its size, the location of its data, time of last access, time of last modification, permissions, and so on. Directories are also represented as files and have an associated inode.
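A quick way to see inodes in practice is to compare the inode numbers of two names for the same file. The sketch below is an illustration only; the helper name inode_of and the temporary file names are made up for it.

```bash
#!/usr/bin/env bash
# inode_of FILE: print the inode number of FILE (first column of "ls -i")
inode_of() {
    ls -di -- "$1" | awk '{ print $1 }'
}

workdir=$(mktemp -d)
echo "hello" > "$workdir/original"
ln "$workdir/original" "$workdir/hardlink"   # a second name for the same inode

inode_of "$workdir/original"
inode_of "$workdir/hardlink"                 # prints the same number as the line above

rm -r "$workdir"
```

A hard link is just another directory entry pointing at the same inode, which is why both calls print the same number.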

Q: - What are the process states in Unix?

As a process executes it changes state according to its circumstances. Unix processes have the following states:

Running : The process is either running or ready to run.
Waiting : The process is waiting for an event or for a resource.
Stopped : The process has been stopped, usually by receiving a signal.
Zombie : The process is dead but has not been removed from the process table.
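These states can be seen in the STAT column of ps. The sketch below maps the first letter of that column to the state names used above; the function names state_name and list_process_states are made up for the illustration, and the letter mapping (R, S, D, T, Z) is the one used by the standard Linux ps.

```bash
#!/usr/bin/env bash
# state_name CODE: translate the first letter of a ps STAT value
state_name() {
    case ${1:0:1} in
        R)   echo "Running" ;;
        S|D) echo "Waiting" ;;    # S = interruptible sleep, D = uninterruptible (usually I/O)
        T)   echo "Stopped" ;;
        Z)   echo "Zombie" ;;
        *)   echo "Other" ;;
    esac
}

# Print PID, state name and command name for every process
list_process_states() {
    ps -e -o pid= -o stat= -o comm= | while read -r pid stat comm; do
        printf '%6s  %-8s %s\n' "$pid" "$(state_name "$stat")" "$comm"
    done
}

list_process_states
```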

Q: - What command should you use to check the number of files and disk space used and each user's defined quotas?

repquota

Q: - What command is used to remove the password assigned to a group?

gpasswd -r

Q: - What can you type at a command line to determine which shell you are using?

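echo $SHELL
(This prints your login shell. To see the shell that is actually running, use echo $0 or ps -p $$.)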

Q: - Write a command to find all of the files which have been accessed within the last 30 days.

find / -type f -atime -30 > filename.txt

Q: - What is a zombie?

A zombie is a process that has terminated but whose parent has not yet collected its exit status (with wait). Until the parent does so, the structural information of the process is still in the process table.

Q: - What daemon is responsible for tracking events on your system?

syslogd

Q: - What do you mean by a File System?

File System is a method to store and organize files and directories on disk. A file system can have different formats called file system types. These formats determine how the information is stored as files and directories.

Q: - Name the directories in the Linux directory hierarchy.

/root
/boot
/bin
/sbin
/proc
/mnt
/usr
/var
/lib
/etc
/dev
/opt
/srv
/tmp
/media  

Q: - What does the /boot directory contain?

The /boot/ directory contains static files required to boot the system, such as the Linux kernel and the boot loader configuration files. These files are essential for the system to boot properly.

Q: - If someone deletes the /boot directory from your server, then what will happen?

In that case your server will be in an unbootable state. Your server can’t boot without the /boot directory because this directory contains all the files needed for booting.

Q: - What does /dev directory contain?

The /dev directory contains all device files that are attached to system or virtual device files that are provided by the kernel.

Q: - What is the role of udev daemon?

The udev daemon creates and removes these device nodes (files) in the /dev/ directory dynamically, as devices are added to or removed from the system.

Q: - What kinds of files or nodes does the /dev/ directory contain, and how do I access or see device files?

Block Device Files:-

Block device files talk to devices block by block [1 block at a time (1 block = 512 bytes to 32KB)].
Examples: - USB disk, CDROM, Hard Disk

# ls -l /dev/sd*
brw-rw----      1 root          root            8,      0 Mar 15  2009 sda
brw-rw----      1 root          root            8,      1 Mar 15  2009 sda1
brw-rw----      1 root          root            8,      2 Mar 15  2009 sda2
brw-rw----      1 root          root            8,      3 Mar 15  2009 sda3
brw-rw----      1 root          root            8,      4 Mar 15  2009 sda4
brw-rw----      1 root          root            8,      16 Mar 15  2009 sdb

Character Device Files:-

Character device files talk to devices character by character.
Examples: - Virtual terminals, terminals, serial modems, random numbers

# ls -l /dev/tty*
crw-rw----      1 root          root            4,      64 Mar 15  2009 ttyS0
crw-rw----      1 root          root            4,      65 Mar 15  2009 ttyS1
crw-rw----      1 root          root            4,      66 Mar 15  2009 ttyS2
crw-rw----      1 root          root            4,      67 Mar 15  2009 ttyS3

Q: - Tell me the name of device file for PS/2 mouse connection.

/dev/psaux

Q: - Tell me the name of device file for parallel port (Printers).

/dev/lp0

Q: - What does the /etc/X11/ directory contain?

The /etc/X11/ directory is for X Window System configuration files, such as xorg.conf.

Q: - What does the /etc/skel directory contain?

The /etc/skel directory contains files and directories that are automatically copied over to a new user's home directory when such user is created by the useradd or adduser command.

Q: - Name some Linux file systems.

Ext2
Ext3

Q: - What is the difference between ext2 and ext3 file systems?

The ext3 file system is an enhanced version of the ext2 file system.

The most important difference between Ext2 and Ext3 is that Ext3 supports journaling.
After an unexpected power failure or system crash (also called an unclean system shutdown), each mounted ext2 file system on the machine must be checked for consistency by the e2fsck program. This is a time-consuming process and during this time, any data on the volumes is unreachable.
The journaling provided by the ext3 file system means that this sort of file system check is no longer necessary after an unclean system shutdown. The only time a consistency check occurs using ext3 is in certain rare hardware failure cases, such as hard drive failures. The time to recover an ext3 file system after an unclean system shutdown does not depend on the size of the file system or the number of files; rather, it depends on the size of the journal used to maintain consistency. The default journal size takes about a second to recover, depending on the speed of the hardware.

Q: - Any idea about ext4 file system?

The ext4 or fourth extended filesystem is a journaling file system developed as the successor to ext3. Ext4 was released as a functionally complete and stable filesystem in Linux kernel version 2.6.28.

Features of ext4 file system:-

1. Ext3 supports a maximum file system size of 16 TB and a maximum file size of 2 TB. Ext4 supports a maximum file system size of 1 EB and a maximum file size of 16 TB.

[An EB or exabyte is 10^18 bytes; its binary counterpart, the exbibyte (2^60 bytes), is 1,048,576 TiB]
2. Faster fsck than ext3.
3. In ext4 the journaling feature can be disabled, which provides a small performance improvement.
4. Online defragmentation.
5. Delayed allocation
Ext4 uses a filesystem performance technique called allocate-on-flush, also known as delayed allocation. It consists of delaying block allocation until the data is going to be written to the disk, unlike some other file systems, which may allocate the necessary blocks before that step.

Q: - How do we create an ext3 file system on the /dev/sda7 partition?

# mkfs -t ext3 /dev/sda7
(equivalently: mke2fs -j /dev/sda7)

Q: - Can we convert an ext2 filesystem to an ext3 file system?

Yes, we can convert ext2 to ext3 with the tune2fs command.

                tune2fs -j   /dev/

Q: - Is there any data loss during conversion of an ext2 filesystem to an ext3 filesystem?

No

Q: - How do we create an ext4 file system?

# mke2fs -t ext4 /dev/DEV

Q: - Explain /proc filesystem?

/proc is a virtual filesystem that provides detailed information about the Linux kernel, hardware, and running processes. Files under the /proc directory are called virtual files, which is why /proc is called a virtual file system.
These virtual files have unique qualities. Most of them are listed as zero bytes in size. Virtual files such as /proc/interrupts, /proc/meminfo, /proc/mounts, and /proc/partitions provide an up-to-the-moment glimpse of the system's hardware. Others, like the /proc/filesystems file and the /proc/sys/ directory provide system configuration information and interfaces.

Q: - Can we change the parameters in files under the /proc directory?

Yes
To change the value of a virtual file, use the echo command and a greater than symbol (>) to redirect the new value to the file. For example, to change the hostname on the fly, type:

echo www.nextstep4it.com > /proc/sys/kernel/hostname

Q: - What is the use of sysctl command?

The /sbin/sysctl command is used to view, set, and automate kernel settings in the /proc/sys/ directory.
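Reading a value from /proc needs nothing more than the usual text tools. The sketch below pulls one field out of /proc/meminfo and shows the file that backs a sysctl key; the function name meminfo_kb is made up for the illustration, and its optional second argument (an alternative file) is there only to make the function easy to try out.

```bash
#!/usr/bin/env bash
# meminfo_kb KEY [FILE]: print the value (in kB) of one /proc/meminfo field
meminfo_kb() {
    local key=$1 file=${2:-/proc/meminfo}
    awk -v key="$key:" '$1 == key { print $2 }' "$file"
}

# Total RAM in kB, read straight from the virtual file
[ -r /proc/meminfo ] && meminfo_kb MemTotal

# The file behind the kernel.hostname setting;
# "sysctl -n kernel.hostname" reports the same value
[ -r /proc/sys/kernel/hostname ] && cat /proc/sys/kernel/hostname
```

Each sysctl key maps to a path under /proc/sys/ with the dots replaced by slashes, so kernel.hostname lives in /proc/sys/kernel/hostname.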




Apache Interview QA

Interview Questions And Answers

Q: - Where are the log files for the Apache server located?
/var/log/httpd

Q: - What are the types of virtual hosts?
Name-based and IP-based.
Name-based virtual hosting means that multiple host names are served from a single IP address.
IP-based virtual hosting means that each website served has its own IP address. Most configurations are name-based because they require only one IP address.

Q: - How do you restart the Apache web server?
service httpd restart

Q: - How do you check the version of the Apache server?
rpm -qa | grep httpd
(or httpd -v, which prints the version of the installed binary)

Q: - What is the meaning of "Listen" in the httpd.conf file?
It sets the port number on which the server listens, for example port 80 for nonsecure (http) transfers.

Q: - What is DocumentRoot?

It is the location of the files that are accessible to clients. By default, the Apache HTTP server in Red Hat Enterprise Linux is configured to serve files from the /var/www/html/ directory.

Q: - On which ports does the Apache server work?
http - port 80
https - port 443

Q: - What is the name of the main configuration file of the Apache server?
httpd.conf

Q: - Which version of Apache have you worked on?

httpd-2.2.3

Q: - What do you mean by a valid ServerName directive?

The DNS system is used to associate IP addresses with domain names. The value of ServerName is returned when the server generates a URL. If you are using a certain domain name, you must make sure that it is included in your DNS system and will be available to clients visiting your site.

Q: - What is the main difference between <Directory> and <Location> sections?
Directory sections refer to file system objects; Location sections refer to elements of the URL (what appears in the browser's address bar).

Q: -What is the difference between a restart and a graceful restart of a web server?
During a normal restart, the server is stopped and then started, causing some requests to be lost. A graceful restart allows Apache children to continue to serve their current requests until they can be replaced with children running the new configuration.
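A graceful restart is usually paired with a configuration check, so that a broken httpd.conf never reaches the running server. The sketch below assumes the standard apachectl control script with its configtest and graceful actions; the function name graceful_reload and the APACHECTL override variable are made up for this illustration.

```bash
#!/usr/bin/env bash
# graceful_reload: validate the configuration, then ask Apache to restart gracefully.
# APACHECTL (hypothetical override) lets you point at another control script,
# for example apache2ctl on Debian-style systems.
graceful_reload() {
    local ctl=${APACHECTL:-apachectl}
    if "$ctl" configtest; then
        # Children finish their current requests, then are replaced by new ones
        "$ctl" graceful
    else
        echo "configuration test failed; the server keeps running with the old configuration" >&2
        return 1
    fi
}
```

Run graceful_reload as root after editing httpd.conf. If the syntax check fails, nothing is restarted.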

Q: - What is the use of the mod_perl module?
mod_perl is a scripting module that gives better Perl script performance and easy integration with the Web server.

Q: - If you have added “LogLevel debug” in the httpd.conf file, then what will happen?
It will give you more information in the error log in order to debug a problem.

Q: - Can you record the MAC (hardware) address of clients that access your server.
No

Q: - Can you record all the cookies sent to your server by clients in Web Server logs?
Yes, add the following lines to the httpd.conf file.

CustomLog logs/cookies_in.log "%{UNIQUE_ID}e %{Cookie}i"
CustomLog logs/cookies2_in.log "%{UNIQUE_ID}e %{Cookie2}i"

Q: - Can we automatically roll over the Apache logs at specific times without having to shut down and restart the server?
Yes
Use CustomLog and the rotatelogs program.
Add the following line to the httpd.conf file.
CustomLog "| /path/to/rotatelogs /path/to/logs/access_log.%Y-%m-%d 86400" combined

Q: - What can we do to find out how people are reaching your site?
Add the following effector to your activity log format: %{Referer}i

Q: - If you have only one IP address, but you want to host two web sites on your server. What will you do?
In this case I will use name-based virtual hosting.

ServerName 10.111.203.25
NameVirtualHost *:80

<VirtualHost *:80>
ServerName web1.test.com
DocumentRoot /var/www/html/web1
</VirtualHost>

<VirtualHost *:80>
ServerName web2.test2.com
DocumentRoot /var/www/html/web2
</VirtualHost>

Q: - Can I serve content out of a directory other than the DocumentRoot directory?
Yes, by using “Alias” we can do this.

Q: - If you have to map more than one URL to the same directory but you don't want multiple Alias directives, what will you do?
In this case I will use the “AliasMatch” directive.
The AliasMatch directive allows you to use regular expressions to match arbitrary patterns in URLs and map anything matching the pattern to the desired directory.

Q: - How will you put a limit on uploads on your web server?
This can be achieved by LimitRequestBody directive.


LimitRequestBody 100000


Here I have put a limit of 100000 bytes on the request body.

Q: - I want to stop people using my site by Proxy server. Is it possible?


Order Allow,Deny
Deny from all
Satisfy All



Q: - What is mod_evasive module?

mod_evasive is a third-party module that performs one simple task, and performs it very well. It detects when your site is receiving a Denial of Service (DoS) attack, and it prevents that attack from doing as much damage. mod_evasive detects when a single client is making multiple requests in a short period of time, and denies further requests from that client. The period for which the ban is in place can be very short, because it just gets renewed the next time a request is detected from that same host.

Q: - How do you enable PHP scripts on your server?
If you have mod_php installed, use AddHandler to map .php and .phtml files to the PHP handler.
AddHandler application/x-httpd-php .phtml .php


Q: - Which tool you have used for Apache benchmarking?
ab (Apache bench)
ab -n 1000 -c 10 http://www.test.com/test.html


Q: - Can we cache files which are viewed frequently?
Yes we can do it by using mod_file_cache module.
CacheFile /www/htdocs/index.html


Q: - Can we have two Apache servers of different versions?
Yes, you can have two different Apache servers on one machine, but they can't listen on the same port at the same time. Normally Apache listens on port 80, which is the default HTTP port. The second Apache version should listen on another port, set with the Listen option in httpd.conf, for example port 81.

For testing a new Apache version before moving your sites from one version to another, this might be a good option. You just type www.example.com:81 in the browser window and you will be connected to the second Apache instance.



Redirect thread dump to another file?

On a JBoss or Tomcat application server, we usually use kill -3 PID to get a thread dump on the default STDOUT, which for Tomcat is catalina.out under the $Tomcat_Home/logs folder. It might seem natural to use kill -3 PID > some.file 2>&1 to redirect the thread dump to some.file instead of the default one. However, that will not work. The reason is that kill is just a command that sends a signal to a process. The redirect applies to the output of the kill command itself, not to the process; what the process does on receipt of the signal is separate, so the redirect has no effect on which file the process (PID) writes to. Given that, if we need the thread dump for that process to go to some other file, we have to add the redirects to that process when it starts.
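Adding the redirect at start-up can be sketched as follows. This is an illustration under stated assumptions, not the way any particular application server's start script does it: the function name start_logged, the variable STARTED_PID, and the paths and app.jar in the usage comment are all placeholders.

```bash
#!/usr/bin/env bash
# start_logged LOGFILE COMMAND [ARGS...]
# Starts COMMAND in the background with stdout and stderr appended to LOGFILE
# and stores its process ID in STARTED_PID.
start_logged() {
    local logfile=$1
    shift
    "$@" >> "$logfile" 2>&1 &
    STARTED_PID=$!
}

# Hypothetical usage with a JVM:
#   start_logged /var/log/myapp/stdout.log java -jar app.jar
#   kill -3 "$STARTED_PID"    # the thread dump is appended to /var/log/myapp/stdout.log
```

Because the redirect belongs to the process itself, every later kill -3 writes its dump to the chosen log file, whatever is done with the output of the kill command.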

Another popular way is to use jstack -F PID to force a thread dump (the -F option is meant for a process that is hung and does not respond to a plain jstack PID). jstack is a JVM troubleshooting tool that prints stack traces of all running threads of a given JVM process, a Java core file, or a remote debug server. It comes with the JDK, so it is free too. :-)

jstack

If installed/available, we recommend using the jstack tool. It prints thread dumps to the command line console.
To obtain a thread dump using jstack, run the following command:
jstack <pid>
You can output consecutive thread dumps to a file by using the console output redirect/append directive:
jstack <pid> >> threaddumps.log

jstack script

Here's a script, taken from eclipse.org, that will take a series of thread dumps using jstack.

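#!/bin/bash
# Take a series of thread dumps of one JVM with jstack.
if [ $# -lt 2 ]; then
    echo >&2 "Usage: jstackSeries <pid> <run_user> [ <count> [ <delay> ] ]"
    echo >&2 "    Defaults: count = 10, delay = 0.5 (seconds)"
    exit 1
fi
pid=$1          # required: process ID of the JVM
user=$2         # required: user the JVM runs as
count=${3:-10}  # defaults to 10 times
delay=${4:-0.5} # defaults to 0.5 seconds
while [ "$count" -gt 0 ]
do
    # one dump per file, named after the PID and the time of day
    sudo -u "$user" jstack -l "$pid" > "jstack.$pid.$(date +%H%M%S.%N)"
    sleep "$delay"
    count=$((count - 1))
    echo -n "."
done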


Just run it like this:
bash jstackSeries.sh <pid> <cq5serveruser> [count] [delay]
For example:
bash jstackSeries.sh 1234 cq5serveruser 10 3



Taking Thread Dump in Linux

Generating a Thread Dump on Linux, Solaris and other Unixes

1.) Identify the Java process that the application (for example JIRA or JBoss) is running in. This can be achieved by running a command similar to:

[root@server2~]#ps -ef | grep java
root      8876  8854 12 10:54 pts/5    00:08:37 /usr/jdk1.6.0_16/bin/java -Dprogram.name=run.sh -server -Xms256m -Xmx1024m -XX:MaxPermSize=256m -Dorg.jboss.resolver.warning=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.net.preferIPv4Stack=true -Djava.endorsed.dirs=/data2/jboss/lib/endorsed -classpath /data2/jboss/bin/run.jar:/usr/jdk1.6.0_16/lib/tools.jar org.jboss.Main -c all -b 192.168.2.102 -Djboss.messaging.ServerPeerID=1
root     10154  9577  0 12:02 pts/3    00:00:00 grep java
 

2.) Take the process ID of the JVM from the ps output above (8876 in this example) and send it signal 3 (SIGQUIT):

kill -3 <pid>
The thread dump will be printed to the standard output of that Java process.
 
Output will be something Like this:
Full thread dump Java HotSpot(TM) Server VM (14.2-b01 mixed mode):

"RMI TCP Connection(idle)" daemon prio=10 tid=0x09c19c00 nid=0x2776 waiting on condition [0x5f76f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x6d21b708> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

"OOB-104,192.168.2.102:56221" prio=10 tid=0x62c7b000 nid=0x2764 waiting on condition [0x5eb69000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x75208848> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:422)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
        at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:857)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

 

Java Thread Dump

Understanding a Java thread dump


1. Introduction

In my opinion, one of the greatest things about Java is the ability to get meaningful thread dumps on a running production environment without having to enable DEBUG mode. The thread dump is a snapshot of exactly what's executing at a moment in time. While the thread dump format and content may vary between the different Java vendors, at the bare minimum it provides you a list of the stack traces for all Java threads in the Java Virtual Machine. Using this information, you can either analyze the problem yourself, or work with those who wrote the running code to analyze the problem.

2. What is a stack trace?

I mentioned earlier that the thread dump is just a list of all threads and the full stack trace of code running on each thread. If you are a J2EE Application Server administrator and you've never done development before, the concept of a stack trace may be foreign to you. A stack trace is a dump of the current execution stack that shows the method calls running on that thread from the bottom up. If you're unfamiliar with what a method is, please see: http://en.wikipedia.org/wiki/Subroutine.
Here is an example stack trace for a thread running in WebLogic:
"ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'" daemon prio=1 tid=0x0938ac90 nid=0x2f53 waiting for monitor entry [0x80c77000..0x80c78040]
 at weblogic.socket.PosixSocketMuxer.processSockets(PosixSocketMuxer.java:95)
 - waiting to lock <0x8d3f6df0> (a weblogic.socket.PosixSocketMuxer$1)
 at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:29)
 at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:42)
 at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:145)
 at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117)
You don't need to understand what it's doing at the moment, but the key point to remember is that it is bottom up. This means that it started with weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117), that method called the "execute" method above it which called the one above it and so on.
The key here is knowing that what is currently running (or waiting) is always the top method. This will give you insight as to what the threads are stuck on. I usually glance through the whole stack trace because developers usually provide clues in method names as to what is actually going on. In the above thread, you'll notice that the method currently running (or in this case, waiting on a Java lock) is weblogic.socket.PosixSocketMuxer.processSockets(). Socket Muxer is a term used by BEA and others to refer to managing data on network sockets (data sent across the network interface).
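Since the top method is what matters first, it helps to pull just that line out of every thread in a dump. The awk sketch below does this for dumps in the format shown above (a quoted thread name on the header line, followed by "at ..." frames); the function name top_frames is made up for the illustration.

```bash
#!/usr/bin/env bash
# top_frames [FILE]: for each thread in a dump, print "thread name -> top stack frame"
top_frames() {
    awk '
        /^"/ {                            # a thread header line starts with a double quote
            split($0, parts, "\"")
            name = parts[2]               # text between the first pair of quotes
            want_frame = 1
            next
        }
        want_frame && /^[ \t]*at / {      # first "at ..." line after the header
            frame = $0
            sub(/^[ \t]*at /, "", frame)
            print name " -> " frame
            want_frame = 0
        }
    ' "$@"
}

# Usage: top_frames threaddumps.log
```

Threads without any Java frames (some JVM-internal threads) are simply skipped.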

3. Thread pools

Most application servers use thread pools to manage execution tasks of a certain type. A thread pool is merely a collection of threads set aside for a specific task. In the example thread above, I've shown you a thread from the WebLogic thread pool (or queue) named "weblogic.socket.Muxer". A pool of these "Muxer threads" is set aside by WebLogic to manage reading and writing data for network connections coming into WebLogic.
In most cases, you won't be looking at the Muxer threads. When someone reports that the application server isn't responding, it usually means that the application code deployed to the application server isn't working right - and you'll need to figure out why. So you'll need to identify the thread pool that your application code runs on and find those threads in the Java thread dump to see what's going on.
Unless you've customized the execute queue/thread pool that your application gets deployed to, you'll want to look for the given application server's "Default" execute queue. In WebLogic, you'll look for the threads marked as 'weblogic.kernel.Default' to see what's running.
For WebLogic 10, here is an example of just such a thread:
"[ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=1 tid=0x091962f8 nid=0x2f95 in Object.wait() [0x7cd75000..0x7cd75ec0]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x8ed19d28> (a weblogic.work.ExecuteThread)
 at java.lang.Object.wait(Object.java:474)
 at weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:156)
 - locked <0x8ed19d28> (a weblogic.work.ExecuteThread)
 at weblogic.work.ExecuteThread.run(ExecuteThread.java:177)
In WebLogic, when you see the keyword "waitForRequest" in a particular thread's stack trace, it means that the thread is idle and could potentially do work if a new request came in. Having idle threads is usually a good thing, but it doesn't always guarantee that your application is healthy or responsive. It just means that you have the potential for more throughput.
Once you know which execute queue is used for the unresponsive application code, you know exactly where to look to figure out what's going on.
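Counting idle and busy threads on the default queue gives a quick first impression of a dump. The sketch below assumes a WebLogic dump in the format shown above, where default-queue threads carry 'weblogic.kernel.Default' in the header line and idle ones have waitForRequest in their stack; the function name default_queue_summary is made up for the illustration.

```bash
#!/usr/bin/env bash
# default_queue_summary [FILE]: count threads on the WebLogic default execute queue
# and how many of them are idle (waitForRequest appears in their stack trace).
default_queue_summary() {
    awk '
        /^"/ {                                   # a new thread header
            in_default = ($0 ~ /weblogic\.kernel\.Default/)
            if (in_default) total++
            next
        }
        in_default && /waitForRequest/ {         # idle marker inside this thread
            idle++
            in_default = 0                       # count each thread only once
        }
        END { printf "total=%d idle=%d busy=%d\n", total, idle, total - idle }
    ' "$@"
}

# Usage: default_queue_summary weblogic.out
```

If busy is close to total, the stack traces of the busy threads are the next thing to read.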

4. Getting a thread dump

There are quite a few ways to obtain a thread dump from a running virtual machine. I've found that the easiest way to get a thread dump is to send the HUP or BREAK signal to the java process. In Unix, this is done by figuring out the PID (process ID) of the Java virtual machine and doing a "kill -3 ".
In Windows, you can use the SendSignal.exe program to send a "break" signal to the process and obtain a thread dump.
Where you go to find the thread dump usually depends on the Java implementation. For most vendors, you'll go to the "standard out" log file. In WebLogic, it's often referred to as "weblogic.out", "nohup.out" or something you've created yourself by redirecting standard output to a file. This is why it's critical to redirect standard output & standard error to a log file so it is captured when you need a thread dump. See http://www.javasanity.org/basicsetup for information on how to do this.
For IBM Java, the thread dump gets printed to a file with a prefix of "javacore" in its name. The file gets written to the Java process's "working directory". You may have to do some digging to find it.
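The Unix step can be wrapped in a small function, shown here as a minimal sketch. The name request_thread_dump is made up for illustration, and the function only sends the signal. The dump still has to be collected from the standard output log or the javacore file, as described above.

```shell
#!/bin/bash
# Hypothetical helper: ask a JVM for a thread dump by sending signal 3 (SIGQUIT).
# The dump is written by the JVM itself, not by this function.
request_thread_dump() {
    local pid=$1
    # Reject anything that is not a plain decimal PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: request_thread_dump <pid>" >&2
            return 2
            ;;
    esac
    # A JVM normally handles signal 3 by printing the dump and then keeps running.
    kill -3 "$pid"
}
```

If exactly one WebLogic JVM runs on the host, something like request_thread_dump "$(pgrep -f weblogic.Server)" may find the PID for you. The pattern weblogic.Server is an assumption about how the server was started, so check it against your own process list first.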

5. Full Thread dump

Take a look at this example thread dump taken from WebLogic 10 while running some sample requests. You can look at this to get familiar with the format of a WebLogic thread dump running in the Sun JVM. Take a moment to search for the default execute queue and figure out what the code was doing when this thread dump was taken. Don't read the next section until you're sure you want the answer.

6. Summary

Being able to obtain and analyze Java thread dumps is critical for understanding what's really going on inside your J2EE application server. Without thread dumps, you're blind when it comes to trying to get to the root cause of an application server "hang" condition. I'd suggest always getting a thread dump before restarting a hung or misbehaving application server. It might not always be useful, but it doesn't hurt to get one in case it's needed later.
If you took the time to analyze the full thread dump in section 5, I hope you took the time to find the threads marked "weblogic.kernel.Default" to see what's going on. In this thread dump, there are only two types of requests running on the Default queue. There are idle threads (which are not the issue) and there are threads running my "insert_test.jsp" java server page.
If you found that they all were doing something within "Oracle" routines, kudos to you. If you understand network programming, you might even have realized that when the top function is doing a "socketRead", it's waiting on data to come across the network. Seeing most of your executing threads waiting on data from the network is usually a pointer to the next place you need to look to identify the cause of your problem.
What I provided here was a simulation of a database issue. In this case, all running "Default" threads were waiting on a response from the database. This response was never going to come, because I locked the table in exclusive mode and the JSP was trying to do an insert. I solved this by killing the lock - this allowed the threads to complete. A poorly tuned database (missing or corrupt indexes, SGA, disk I/O, high CPU on the DB, etc.) can cause a similar "socketRead" bottleneck within the JDBC drivers for your application. I've also found that poorly designed SQL code or tables that were too large and never purged can cause this as well.
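To spot this kind of pattern quickly, one option is to tally the top stack frame of every Default queue thread. The sketch below does that with awk. The function name top_frames is hypothetical, and it assumes the Sun JVM dump layout, with a quoted header line per thread followed by "at ..." frame lines.

```shell
#!/bin/bash
# Hypothetical helper: count the top stack frame of each Default queue thread.
# Output is "<count> <method>", most frequent first.
top_frames() {
    awk '
        /^"/ {
            # Thread header: only look for a frame if the thread is on the Default queue.
            want_frame = ($0 ~ /weblogic\.kernel\.Default/)
            next
        }
        want_frame && /^[[:space:]]*at / {
            sub(/^[[:space:]]*at /, "")   # drop the leading "at "
            sub(/\(.*$/, "")              # drop "(File.java:123)" or "(Native Method)"
            count[$0]++
            want_frame = 0                # only the first frame of each thread counts
        }
        END { for (f in count) printf "%d %s\n", count[f], f }
    ' "$1" | sort -rn
}
```

A dump where most Default threads show a socketRead frame at the top points at whatever sits on the other end of that socket, which in the simulation above was the database.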

A special thanks to the author of the original post.
Regards
!Kuldeep Sharma

Wednesday, April 6, 2011

How to log output of remote ssh session ?

There are many instances when you ssh to a remote server for troubleshooting and data gathering purposes and want to save that data on your own computer.
The less frequently used but useful "tee" command can log all output of a remote ssh session. It writes a single file on your local machine which captures all the commands as well as their output.

ssh user@remote.server.com | tee /path/of/log/file


This command is very useful for troubleshooting purposes.
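A small wrapper can make this a habit. The sketch below is one possible approach, not a standard tool. The function name log_session is made up, "tee -a" appends so that earlier logs are not overwritten, and standard error is merged into the log as well.

```shell
#!/bin/bash
# Hypothetical helper: run a command and copy everything it prints into a log file.
# Usage: log_session <logfile> <command> [args...]
log_session() {
    local logfile=$1
    shift
    # 2>&1 sends error messages through the pipe too; tee -a appends to the log.
    "$@" 2>&1 | tee -a "$logfile"
    # Return the exit status of the command itself rather than the status of tee.
    return "${PIPESTATUS[0]}"
}
```

For example, log_session "$HOME/ssh-$(date +%Y%m%d-%H%M%S).log" ssh user@remote.server.com would give every session its own timestamped file. The log location is only a suggestion.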



!Enjoy
Kuldeep Sharma

Friday, April 1, 2011

Some Interview Stuff


1) Define a daemon?
Ans: In Unix or other multitasking operating systems, a daemon is a computer program that runs in the background, rather than under the direct control of a user. Daemons are usually started as background processes. Typically, daemon names end with "d", e.g. mysqld, syslogd, pptpd, sshd etc.

2.) Can we use crontab to run a script every second? If yes, how? If no, why?
Ans: I think we can't, at least not with crontab alone. In crontab (the one the user can edit) the smallest time period is 1 minute, and the cron daemon, which checks the crontab files, wakes up only once a minute to see which jobs are due.
         But we can make use of some scripting: a script started by cron can loop and sleep to repeat a command every few seconds, or roughly once a second, although the timing will not be exact.
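A minimal sketch of the scripting workaround follows. The function name run_repeatedly and its arguments are illustrative. The time the command itself takes is added to each sleep, so this only approximates a fixed interval.

```shell
#!/bin/bash
# Hypothetical helper: run a command COUNT times, sleeping INTERVAL seconds after each run.
# Usage: run_repeatedly <count> <interval> <command> [args...]
run_repeatedly() {
    local count=$1 interval=$2
    shift 2
    local i
    for (( i = 0; i < count; i++ )); do
        "$@"
        sleep "$interval"
    done
}
```

A wrapper script scheduled with "* * * * *" could call run_repeatedly 12 5 /path/to/job.sh (a placeholder path) to get a run about every 5 seconds. Pushing it down to run_repeatedly 60 1 will usually overrun the minute, so consecutive cron invocations may overlap.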

3) I have a file in which each line contains “/var/www/html”; replace this entry in all lines with “/home/xyz/red” using a single command or an editor.
Ans: For this, use the sed (Stream Editor) command with a separator other than "/", so that the slashes in the paths do not need escaping.
   e.g. :   #sed -i 's#/var/www/html#/home/xyz/red#g' <filename>
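Here is the same substitution as a small function that can be tried safely on a copy of a file. The function name replace_docroot is made up for this sketch, and "-i.bak" is used instead of plain "-i" so that the original file is kept with a .bak suffix.

```shell
#!/bin/bash
# Hypothetical helper: replace /var/www/html with /home/xyz/red on every line of a file.
# "#" is the sed delimiter, so the slashes in the paths need no escaping.
# -i.bak edits the file in place and keeps the original as <file>.bak.
replace_docroot() {
    sed -i.bak 's#/var/www/html#/home/xyz/red#g' "$1"
}
```

The trailing "g" replaces every occurrence on a line, not just the first one.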

4) Please give me the syntax for a FOR loop in Bash scripting (please specify 2 types too).
Ans:
   1.)  for (( EXP1; EXP2; EXP3 ))
    do
        command1
        command2
        command3
    done
e.g.    
    for (( c=1; c<=5; c++ ))
    do
        echo "Welcome $c times..."
    done
   2.)  for VARIABLE in 1 2 3 4 5 .. N
    do
        command1
        command2
        commandN
    done
e.g.
    for i in 1 2 3 4 5
    do
       echo "Welcome $i times"
    done   

5) How to create a daemon with shell scripting?
Ans : Creating a simple daemon is easy with a shell script, here called monitor.sh
Code:

#!/bin/bash
while :
do
    # add code to monitor dir here and take some action
    sleep 5    # pause between checks (5 seconds is an arbitrary choice); bash rejects an empty loop body
done

Run monitor.sh in the background, or use nohup or friends:
Code:

nohup ./monitor.sh &

6) How can I check the exit status of my previous command?
Ans: #echo $?
  If it gives 0, then the last command you ran executed successfully.
    Whenever you execute a command in Linux, the command returns an exit status code (a number from 0 to 255), which is useful when you need to know whether the execution failed or succeeded. An exit status code of:

    * 0 means the command executed properly without any errors
    * any non-zero value means the command failed in some way; what each value means depends on the command
    * many commands use 1 for a general or minor failure and 2 for a more serious error, but this is a convention, not a rule (grep, for example, returns 1 when no line matched)

    Getting the exit code is easy: simply execute any command in Linux. Immediately after, to check its exit status, type

echo $?

and you will see a number printed: 0 on success, or a non-zero value such as 1 or 2 on failure.
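In scripts, the status is usually saved right after the command runs, because $? is overwritten by the next command. The sketch below shows one way to do this. The function name describe_status is illustrative.

```shell
#!/bin/bash
# Hypothetical helper: run a command quietly and report how it exited.
# Usage: describe_status <command> [args...]
describe_status() {
    "$@" >/dev/null 2>&1
    local status=$?    # save it immediately; the next command would overwrite $?
    if [ "$status" -eq 0 ]; then
        echo "succeeded"
    else
        echo "failed with status $status"
    fi
    return "$status"
}
```

For instance, describe_status ls /tmp reports success when the directory can be listed, while a command that exits non-zero is reported together with the exact status it returned.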





!Enjoy
Kuldeep Sharma