Discussion:
More SAPDB connect issues - DBMCLI concurrent
Stephen Gutknecht (SAPDB)
2002-10-04 08:53:52 UTC
As some of you know, I'm pulling my hair out trying to track down why we
have both driver crashes and complete stalls of the SAPDB software. On our
production web sites, about every 24 hours we run into a stall of the
database server or a crash of the driver. Stalls can last as long as 5
minutes where SAPDB doesn't respond. We believe we have eliminated all
hardware / network issues (yes, even DNS).

Our site has a lot of concurrency, so I've been trying to devise "stress
tests" that do a lot of small transactions over and over like we do on our
web site. The problem seems to only crop up after accumulated usage,
pointing to small leaks or other problems...

A week or so ago I posted a command line dotNet stress test program that
could pretty quickly generate errors out of the ODBC driver. Most common
were -709 errors.

Now I have a new symptom of the problem. While running my web site testing
today, I was looking at DBMCLI and found that I once got hit with an error
from DBMCLI! So I decided to stress test DBMCLI.

Following is a batch file to run DBMCLI in a loop and have it stop once it
hits an error.


==== BEGIN Win32 BATCH FILE =====

@ECHO OFF
REM ***
REM *** Change the following line to be unique for each run
REM ***
SET O=lasterr1.txt
SET Udbm=dbm,dbm
SET DB1=TST
SET A=
:Top
SET A=%A%!
IF %A%==!!!!!!!!!!!!!!!!!! GOTO ShowOne

dbmcli -n localhost -d %DB1% -u %Udbm% -uSQL -c info state > %O%
IF ERRORLEVEL 1 GOTO ERROR1
GOTO Top

:ShowOne
ERASE %O%
dbmcli -n localhost -d %DB1% -u %Udbm% -uSQL -c info state
IF ERRORLEVEL 1 GOTO ERROR1

rem *** ping a host you can't find to simulate a sleep command.
ping -w 1000 -n 1 192.187.188.2

SET A=
GOTO Top

:ERROR1
ECHO ***
ECHO *** Error encountered!
ECHO ***
IF EXIST %O% TYPE %O%

==== END Win32 BATCH FILE =====

Works great, runs for hours without problem.

The problem starts when you go ahead and open up a second CMD.EXE prompt and
run a second instance of the batch file at the same time.

IMPORTANT: To run more than one test at the same time, you need to make a
second copy of the BATCH file and revise the O parameter on line 5. The O
parameter needs to be unique for each (concurrent) instance. Example:
TEST1.BAT has O=test1.txt and TEST2.BAT has O=test2.txt on line 5.

After only a few minutes, I start getting errors like:

1. Error! Connection failed to node localhost for database TST:
ERR_USRFAIL: user authorization failed

2. -24988,ERR_SQL: sql error
-4008,Unknown user name/password combination

3. Error! Connection failed to node localhost for database TST: could not
connect to socket [10048]

4. The syntax of the command is incorrect.

Or just run ONE instance at the same time you have a looping ODBC
application running, and you start getting these random errors from the ODBC
side:

ERROR [08001] [SAP AG][SQLOD32 DLL][SAP DB]Unable to connect to data
source;-709 CONNECT: (could not connect to socket [10048])

Correct me if I'm wrong, but the DBMCLI isn't using ODBC, so is the problem
deeper within SAPDB than the ODBC driver? I notice DBMCLI produces
different errors depending on how long things have been running. If you
reboot and start fresh, it takes a while for errors to appear -- and the
pattern of errors seems to change after the tests have been running for some
time. Leaks?

Please, if you have time, try to reproduce and track down these problems.

Anyone have time to test Linux for the same problem?

Thank you.

Stephen Gutknecht
Renton, Washington USA
Dittmar, Daniel
2002-10-04 09:01:47 UTC
Could you rerun the test by omitting the -n localhost option? This bypasses socket communication. If the same errors occur we can at least rule out a problem with the xserver.

Some of the errors occurring with dbmcli happen before the dbm server connects to the database. So unless there is really a problem with the socket communication, those errors and the ODBC problem are probably unrelated.

Daniel Dittmar
--
Daniel Dittmar
SAP DB, SAP Labs Berlin
***@sap.com
http://www.sapdb.org/
Stephen Gutknecht (SAPDB)
2002-10-04 17:31:28 UTC
Hi Daniel,

Even with no -n parameter, it still happens.

There is a bug in my test script related to concurrency :) I fixed it.
Following is a revised version.


=== BEGIN Win32 BATCH FILE Script ====

@ECHO OFF
SETLOCAL
REM ***
REM *** Change the following O= line to be unique for each run
REM *** Also revise Udbm and DB1 to match your SAPDB installation.
REM ***
SET O=lasterr1.txt
SET Udbm=dbm,dbm
SET DB1=TST
SET A=
:Top
SET A=%A%!
IF %A%==!!!!!!!!!!!!!!!!!! GOTO ShowOne

dbmcli -d %DB1% -u %Udbm% -uSQL -c info state > %O%
IF ERRORLEVEL 1 GOTO ERROR2
GOTO Top

:ShowOne
ERASE %O%
dbmcli -d %DB1% -u %Udbm% -uSQL -c info state
IF ERRORLEVEL 1 GOTO ERROR1

rem *** ping a host you can't find to simulate a sleep command.
ping -w 1000 -n 1 192.187.188.2

SET A=
GOTO Top

:ERROR1
ECHO ***
ECHO *** Error encountered!
ECHO *** Should show above.
ECHO ***
GOTO Done

:ERROR2
ECHO ***
ECHO *** Error encountered!
ECHO *** Output of last execution:
ECHO ***
TYPE %O%
GOTO Done

:Done

===== END Win32 BATCH File Script =====


Forgot my "SETLOCAL".

When you run multiple copies, you need to revise the SET O= line to use a
unique file per instance (line #7 of the batch file above).

Example instructions to reproduce:

1. Copy this batch file from your e-mail to c:\test1.bat
2. Copy test1.bat to test2.bat.
3. Edit test2.bat and change line #7 to "SET O=lasterr2.txt"
4. Open two CMD.EXE shells (using Start/Run) on Windows XP Professional or
Windows 2000 Professional / Server.
5. Start test1.bat in the first CMD.EXE shell, start test2.bat in the second
CMD.EXE session.

Within a few minutes I get a:

ERR
-24988,ERR_SQL: sql error
-4008,Unknown user name/password combination

As also noted in my original post, if I run a looping ODBC.NET program and
one instance of this batch file at the same time.... I get socket errors on
the ODBC side.

Thanks.

Stephen


Simon Matter
2002-10-07 10:25:46 UTC
Post by Stephen Gutknecht (SAPDB)
Hi Daniel,
Even with no -n parameter, still happens.
There is a bug in my test script related to concurrency :) I fixed it.
Following is a revised version.

I ran the following script on RedHat 7.2 / SAPDB 7.3.0.25 for 2
hours without any problem. The script basically runs 10 concurrent
dbmcli sessions and appends their output to a common log.

Simon

----------------------------------------------------------------------------
#!/bin/sh

# Where to find the executables
IND_PROG_DBROOT=""
if [ -f /var/spool/sql/ini/SAP_DBTech.ini ]; then
    IND_PROG_DBROOT=`grep '^IndepPrograms=' \
        /var/spool/sql/ini/SAP_DBTech.ini | sed 's:IndepPrograms=::g'`
else
    exit 0
fi

# Binaries we need
DBMCLI=$IND_PROG_DBROOT/bin/dbmcli

STRESSDB=TST
STRESSUSER="dbm,dbm"

# Call dbmcli in an endless loop; stop this worker at the first failure
docli() {
    while true; do
        $DBMCLI -d $STRESSDB -u $STRESSUSER -uSQL -c info state >> stresscli.log
        if [ $? -ne 0 ]; then
            echo "Error encountered!"
            exit 1
        fi
    done
}

docli &
docli &
docli &
docli &
docli &
docli &
docli &
docli &
docli &
docli &
tail -f stresscli.log
----------------------------------------------------------------------------
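
A variation on the script above, offered only as a minimal sketch: it
gives each concurrent worker its own log file (mirroring the per-instance
O= file of the batch version) and reports which worker failed first. The
names STRESS_CMD, WORKERS, ITERATIONS and LOGDIR are hypothetical and not
part of SAPDB; the default command is the dbmcli call used in this thread
and assumes dbmcli is on the PATH.

```shell
#!/bin/sh
# Sketch only: run several concurrent loops of one command, with one log
# file per worker. All variable names below are illustrative assumptions.
STRESS_CMD=${STRESS_CMD:-"dbmcli -d TST -u dbm,dbm -uSQL -c info state"}
WORKERS=${WORKERS:-2}          # number of concurrent loops
ITERATIONS=${ITERATIONS:-100}  # calls per worker before it stops
LOGDIR=${LOGDIR:-.}            # directory for the per-worker logs

# Run the command ITERATIONS times; stop at the first non-zero exit status.
worker() {
    id=$1
    log="$LOGDIR/stresscli.$id.log"
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        # Overwrite the log on every pass so that it always holds the
        # output of the last call, like "> %O%" in the batch file.
        if ! $STRESS_CMD > "$log" 2>&1; then
            echo "worker $id: error on iteration $i, see $log"
            return 1
        fi
        i=$((i + 1))
    done
    echo "worker $id: $ITERATIONS iterations without error"
}

# Start WORKERS background workers and wait for all of them.
# Returns 1 if any worker reported an error, 0 otherwise.
run_stress() {
    pids=""
    n=1
    while [ "$n" -le "$WORKERS" ]; do
        worker "$n" &
        pids="$pids $!"
        n=$((n + 1))
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

To use it, append a call to run_stress at the end of the file and start
it with, for example, WORKERS=10 ITERATIONS=1000 sh stress.sh (the file
name is arbitrary). Separate logs avoid the shared-output-file problem
that the revised batch file had to work around.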
--
Simon Matter Tel: +41 61 695 57 35
Fr.Sauter AG / CIT Fax: +41 61 695 53 30
Im Surinam 55
CH-4016 Basel [mailto:***@ch.sauter-bc.com]