Diagnosing PPP/DOIP hangs and failure to connect
------------------------------------------------

This is best read with a monospaced font.
By: Bob McLellan
    The Little Blue Kiwi
    bobmcl@ibm.net
    www.thelittlebluekiwi.co.nz

Contents


1. Background
2. Prereq
3. Traces
4. Trace contents
5. The symptom
6. The solution

1. Background
There are a number of different reasons why PPP will fail to connect. This note describes some diagnostic tools to help fix the problem and provides sample traces of one particular failure. The fault was encountered using DOIP, but the same principles apply if you are using Injoy.

2. Prereq
Make sure you have the UN00980 fixpack applied to TCPIP, or at least have the latest PPP.EXE (revision 1.18c) installed. Some earlier versions of PPP.EXE caused these problems all by themselves. Make sure the ETC environment variable is set in CONFIG.SYS (it is by default, so most systems will have it). You will also need the latest DOIP program (SLIPPM.EXE, revision 1.19).

3. Traces
The DOIP (Dial Other Internet Providers icon) program is a GUI front end to PPP.EXE and SLATTACH.EXE. SLATTACH manages the modem dialling and PPP provides the basic protocol over the phone line. Each of these programs can be invoked separately from the command line; details are in the TCPIP Command Reference in the OS/2 Information Centre. Later versions of DOIP support PPP as well as SLIP and invoke PPP.EXE and SLATTACH.EXE.
Each of these two programs has a trace option, invoked with a command line switch: '-v' for SLATTACH and 'debug' (no preceding -) for PPP. This causes a log file to be written to the directory named in the ETC environment variable, usually MPTN\ETC. The file is either SLATTACH.LOG or PPPx.LOG, where x is the interface number used by PPP (usually 0).
If you invoke these programs from DOIP the logs can be obtained by clicking on the checkbox 'Debug' on the main panel of DOIP. This also increases the amount of information shown in the scrolling list box on that panel.
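
The log locations described above can be sketched in a few lines of Python. This is an illustration only: the helper name trace_log_paths is hypothetical, and it assumes the ETC variable holds a single directory path.

```python
import ntpath
import os


def trace_log_paths(interface=0, environ=None):
    """Return the expected (SLATTACH, PPP) trace log paths.

    Assumption: ETC names one directory, as set in CONFIG.SYS.
    """
    env = os.environ if environ is None else environ
    etc_dir = env.get("ETC")
    if not etc_dir:
        raise RuntimeError("ETC environment variable is not set")
    # ntpath joins with backslashes, matching OS/2 path syntax on any host.
    slattach_log = ntpath.join(etc_dir, "SLATTACH.LOG")
    ppp_log = ntpath.join(etc_dir, "PPP%d.LOG" % interface)
    return slattach_log, ppp_log
```

Calling trace_log_paths() with no arguments reads the real environment; passing a dictionary makes it easy to check the result for a given ETC setting.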

4. Trace contents

The two included files contain traces of a problem encountered on different interfaces of the same ISP. The problem appeared to be that one interface was slower than the other and caused PPP timeouts. This was marginal because sometimes the interface worked and other times it didn't.
The file 'ppp0-4.log' contains the trace of the failure and 'ppp0-.log' contains the OK connection. I haven't included the SLATTACH.LOG. The entry PID=xxx on each line is the OS/2 process number for this session and is not relevant here.
Notice that the session has a number of sections:
 setup
 modem connect 
 session parameters
 login
 local IP address
 data
 terminate

During the setup section a number of threads are kicked off. Because these are asynchronous the log entries will not always be in the same sequence. Some data blocks will vary in size as well.
The ppp0-4.log failure shows up in the local IP address section as a timeout while the address is being negotiated. The failing session just kept looping through this section. Compare the two logs in this area.
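
One way to compare the two logs is to mask the process number on each line and run a text diff. The sketch below assumes the process number appears as "PID=" followed by digits; the exact layout of a log line is not documented here, and the function names are hypothetical.

```python
import difflib
import re

# Assumption: the process number appears as "PID=" followed by digits.
PID_FIELD = re.compile(r"PID=\d+")


def normalise(lines):
    """Mask the PID field and strip trailing whitespace from each line."""
    return [PID_FIELD.sub("PID=xxx", line.rstrip()) for line in lines]


def diff_logs(bad_lines, good_lines):
    """Return unified-diff lines going from the working trace to the failing one."""
    return list(difflib.unified_diff(
        normalise(good_lines), normalise(bad_lines),
        fromfile="good", tofile="bad", lineterm=""))


def read_lines(path):
    """Read a log file as a list of lines, tolerating odd characters."""
    with open(path, errors="replace") as handle:
        return handle.readlines()
```

Pass the lines of the failing and working logs (for example via read_lines) to diff_logs and print the result one entry per line. Because the setup threads are asynchronous, expect some ordering differences in the setup section that have nothing to do with the fault.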

5. The symptom

This problem showed up as a failure to connect, and in addition the user could not reconnect to the modem and had to reboot. The modem could not be reused because PPP.EXE had not fully shut down and was still holding its modem connection. This is a bug in PPP.EXE that is unlikely to be fixed. The leftover PPP process can be seen by running PSTAT to display running processes. KILLIT2.EXE would not kill PPP at this stage but PSPM did. Then we could use the modem again.

6. The solution

The way around this is to alter the PPP timeouts. This is done by creating a file PPP.CFG in MPTN\ETC and making entries which change the default timeouts. A sample is included; see the TCPIP Command Reference for details. After this we could successfully connect to the ISP.