SDL hangs on SDL_Init() when $DISPLAY is remote

Hello!

When $DISPLAY points to a display over the network and you run an app that
uses SDL, the app just hangs (forever) without doing anything.

I started to debug this, and I found that the hanging function is in SDL_Init().

When you call SDL_Init(SDL_INIT_VIDEO), SDL’s X11_VideoInit() is called, which
then calls X11_GetVideoModes() (src/video/x11/SDL_x11modes.c).

X11_GetVideoModes() calls XF86VidModeGetAllModeLines() which is the one
which hangs (forever).

I’m using Debian GNU/Linux sid (unstable) with XFree86 4.2.1 (debian package
4.2.1-9).

Is this a known issue or should I try to debug it more? This is reproducible
with other setups (and other people) too.

To test, do this:

export DISPLAY=some_host_on_the_network:0
<run some sdl app>

and that’s it… the app hangs forever without doing anything.

Thanks!

– Pasi Kärkkäinen

                               ^
                            .     .
                             Linux
                          /    -    \
                         Choice.of.the
                       .Next.Generation.

I’m fairly new to linux and everything, but my two
cents:

Unless this function (XF86…) is supposed to freeze
(which I doubt) then this would be a bug in XFree86
(unless this function is part of the SDL code, which
I’m doubting as well…due to the different naming
format). I’d suggest upgrading: www.xfree86.org (4.3.0
is out).

If the problem continues… a bug report for xfree
would probably be in order.

Possibly related: if two SDL applications attempt to
gain fullscreen, the second one freezes. IMHO, I think
it should return with an error of some sort (being a
C++ fan, I’d say throw an exception… but I
understand some people still like C :) ). This may
also be related to the XFree code (I’m using 4.3.0)
rather than SDL itself.

Simplest way to reproduce: create two SDL apps, and
have the first go into fullscreen, spawn the second
one, and wait awhile (couting/printfing each step).

There may be reason to have the system freeze for a
few seconds: for example, allowing one SDL fullscreen
app to start another then shut down allowing the new
app to continue. However, an indefinite freeze makes no sense.

I’m fairly new to linux and everything, but my two
cents:

Unless this function (XF86…) is supposed to freeze
(which I doubt) then this would be a bug in XFree86
(unless this function is part of the SDL code, which
I’m doubting as well…due to the different naming
format). I’d suggest upgrading: www.xfree86.org (4.3.0
is out).

Yes, that function is a part of XFree86.
I just wanted to let you SDL guys know about this problem (in case somebody else
asks about the same thing or searches the mailing list archives).

I sent that mail to Debian SDL maintainers, and to Debian X development
list… so hopefully somebody from there will take a look at it.

If the problem continues… a bug report for xfree
would probably be in order.

Yep.

I was running just one SDL app, so it’s not about trying to run a couple of
fullscreen SDL apps at the same time… and the app I was running is windowed.

And the hang happens before any SDL window is even created! (when SDL is
querying for available modelines).

Thanks.

– Pasi Kärkkäinen

On Sun, Aug 17, 2003 at 08:33:33PM -0700, Michael Rickert wrote:

I’m fairly new to linux and everything, but my two
cents:

Unless this function (XF86…) is supposed to freeze
(which I doubt) then this would be a bug in XFree86
(unless this function is part of the SDL code, which
I’m doubting as well…due to the different naming
format). I’d suggest upgrading: www.xfree86.org (4.3.0
is out).

If the problem continues… a bug report for xfree
would probably be in order.

For some reason, SDL includes a portion of the XFree86 sources (see the tree
under src/XFree86). The function referenced in the email lies in those
sources.

Is it possible there is some sort of version mismatch between the included
sources and the newer XFree86 releases?

– Howdy
=============================
Howdy Pierce
Managing Partner
Cardinal Peak, LLC

email: howdy -at- cardinalpeak.com
work: (303) 665-3962

Hi SDL List

In August last year Pasi Kärkkäinen posted a message to this list on
the same subject:

On Linux (and perhaps on other platforms using the X Window System)
SDL_Init() never returns when the environment variable DISPLAY is set
to a remote computer.

Last year there did not seem to be a solution to his problem, although
some people thought it had to do with an XFree86 function call.

I have exactly the same problem now and I wonder if a fix has been
found in the meantime. To reproduce the hang on x86 Linux (bash)
platforms running XFree86:

$ rlogin remote_computer
$ export DISPLAY=local_computer:0
$ any_SDL_app

I have confirmed this on Red Hat 7.1 (kernel 2.4.2, SDL 1.1.7, XFree86
4.0.3, glibc 2.96) and Fedora Core 1 (kernel 2.4.22, SDL 1.2.5,
XFree86 4.3.0, glibc 2.3.2).

My little test program is:

#include "SDL.h"
#include <stdio.h>

int main(void)
{
    printf("Initializing SDL...\n");
    fflush(stdout);
    if (SDL_Init(SDL_INIT_VIDEO) == -1)
    {
        printf("SDL returned an error code.\n");
        fflush(stdout);
        printf("Could not initialize SDL: %s.\n", SDL_GetError());
        return(-1);
    }
    printf("SDL initialized OK.\n");
    printf("Quitting SDL...\n");
    fflush(stdout);
    SDL_Quit();
    printf("SDL quit OK.\n");
    return(0);
}

which was compiled with:

gcc -g -O2 -Wall -W -I/usr/include/SDL -L/usr/lib -lSDL -lpthread
SDL-example-1-1.c -o SDL-example-1-1

I also tried SDL_INIT_VIDEO|SDL_INIT_NOPARACHUTE as the argument to
SDL_Init().

I have searched the FAQ and the rest of the libsdl.org site, as well
as googling the Net. Not a sausage.

Any ideas?

Thanks,
Johann

– Johann Schoonees, Imaging & Sensing Team
Industrial Research Limited, PO Box 2225, Auckland, New Zealand
Phone +64 9 9203679 Fax +64 9 3028106 http://www.is.irl.cri.nz/
Camwire’s home: http://kauri.auck.irl.cri.nz/~johanns/camwire/

It works here (and has always worked, AFAIR).
Are you able to run other (non-SDL) programs through X that way (for
example, xclock, which is present on virtually any Unix nowadays)?
If you can run other programs, could you do a strace of it to see where
it stops?

Stephane

Thanks for looking at this, Stephane.

I’ve been running X programs remotely for years. :) xclock too.

There is no problem when I run SDL apps on my local computer (DISPLAY
not set) or locally on any other computer. When I run remotely
without setting DISPLAY, SDL_Init() returns -1 as expected with an
SDL_GetError() message “No available video device.”

Here is a trace (slightly edited):

$ rlogin remote_computer
$ export DISPLAY=local_computer:0
$ xclock &
$ gcc -g -O2 -Wall -W -I/usr/include/SDL -L/usr/lib -lSDL -lpthread
SDL-example-1-1.c -o SDL-example-1-1
$ gdb ./SDL-example-1-1
GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
This GDB was configured as “i386-redhat-linux-gnu”…Using host
libthread_db library “/lib/tls/libthread_db.so.1”.
(gdb) run
Starting program: /home/me/SDL-example-1-1
[Thread debugging using libthread_db enabled]
[New Thread 1076908160 (LWP 16505)]
Initializing SDL.

[It hangs here. I had to interrupt with Control-C.]

Program received signal SIGINT, Interrupt.
[Switching to Thread 1076908160 (LWP 16505)]
0x40000c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0 0x40000c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x4018e2ad in ___newselect_nocancel () from /lib/tls/libc.so.6
#2 0x4024aae2 in _XPollfdCacheDel () from /usr/X11R6/lib/libX11.so.6
#3 0x4024ba51 in _XRead () from /usr/X11R6/lib/libX11.so.6
#4 0x4007022a in SDL_XF86VidModeGetAllModeLines ()
from /usr/lib/libSDL-1.2.so.0
#5 0x4006a853 in X11_GetVideoModes () from /usr/lib/libSDL-1.2.so.0
#6 0x4006c44e in X11_CheckMouseMode () from /usr/lib/libSDL-1.2.so.0
#7 0x40061121 in SDL_VideoInit () from /usr/lib/libSDL-1.2.so.0
#8 0x4003db56 in SDL_InitSubSystem () from /usr/lib/libSDL-1.2.so.0
#9 0x4003db97 in SDL_Init () from /usr/lib/libSDL-1.2.so.0
#10 0x0804860a in main () at SDL-example-1-1.c:9

The above is similar to previous reports:
http://www.libsdl.org/pipermail/sdl/2000-April/026564.html
http://archives.seul.org/pygame/users/Jan-2002/msg00002.html
http://lists.debian.org/debian-x/2003/08/msg00325.html

If any of them found a solution they didn’t publish it where I could
find it.

There is nothing interesting in /var/log/messages.

We have standard installations of Red Hat distributions on standard
desktop PC hardware. We don’t compile our own kernels or anything
like that. The version of SDL is as shipped by Red Hat. We haven’t
touched libc or libX11.

Johann

That might be the problem. I just looked at a Fedora SDL package
(SDL-1.2.5-9), and they happen to patch exactly that part. Would you
mind removing the SDL rpm and compiling SDL from libsdl.org? I think
that could solve your problem.

Stephane

Stephane Marchesin wrote:

Hmm, I was talking about a strace. Here is how to get one:
strace myprogram
That might give quite a big log; if it is too big, mail it directly to me.

Ah. The strace is quite big so I’ll email it to you separately.

Could this problem have something to do with the fact that my home
directory (containing files like .Xauthority) is nfs mounted on a
third server?

Johann

Stephane Marchesin wrote:

Johann Schoonees wrote:

Remember I am having this problem on Fedora Core 1. However, I can
reproduce exactly the same behaviour on Red Hat 7.1 and an older
version of SDL, so the problem might still be the same on Core 2.

The xdpyinfo output follows. My local machine is called prion (RH7.1)
and the remote X client is on towai (FC1).

prion$ rlogin towai
Last login: Mon May 31 11:32:23 from prion
towai$ export DISPLAY=prion:0
towai$ xdpyinfo
name of display: prion:0.0
version number: 11.0
vendor string: The XFree86 Project, Inc
vendor release number: 4003
XFree86 version: 4.0.3

Does this only happen when you use XFree86 4.0.3?
In the SDL source, there’s a warning about XFree86 videomode being
broken for < 4.0.2. Maybe there are also problems with 4.0.3?

Anyway, judging by the logs you sent me, the troublemaker is prion (or
rather, RH7.1). So I should probably look at Red Hat 7.1 instead (FC2 has
no problems with this, btw).

Voila! Now why did I not think of that? It works fine between two
Core 1 machines.

It’s still a bit of a pain for me because most of our machines still
run RH7.1 and only the two lab computers have Core 1, but this is
obviously a problem that will fade away as newer operating systems get
installed.

Thanks for your patient input, Stephane. It means that we will now
probably continue our development with SDL.

Johann

No problem, but I would still like to know if this happens with 4.0.3
only :)
That way these could be special-cased in future SDL versions.

Stephane