Seg fault problem using threads

Timothy_Hanlon · October 1, 2003, 12:49am

Hi,

I’m using a thread in my program for network stuff while the main loop of my game continues. Everything works fine, and I can get a message from the server and printf it, but if I try to do anything else, like just incrementing a variable for the amount of players I get a seg fault (Fatal signal: Segmentation Fault (SDL Parachute Deployed)).

I tried making the net thread call another function to do what was making it seg fault but that has not fixed it.

Is there any “scope” of these threads I should know about? Do I need to run the rest of my program as a thread also, or can I just have the 1 thread for network while the normal game loop continues? I’m really baffled as to why I can do things like printf but not a simple variable++!!!

Thanks heaps in advance,
Tim Hanlon

Bob_Pendleton · October 1, 2003, 1:30pm

Hi,

I’m using a thread in my program for network stuff while the main loop
of my game continues. Everything works fine, and I can get a message
from the server and printf it, but if I try to do anything else, like
just incrementing a variable for the amount of players I get a seg
fault (Fatal signal: Segmentation Fault (SDL Parachute Deployed)).

This sounds a lot like memory corruption. Is your network code writing
an arbitrary amount of information to a fixed size buffer?
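
To make the fixed-size buffer question concrete, here is a minimal sketch in plain C of storing an incoming message without ever writing past the end of the buffer. It is an illustration only, not code from the program being discussed: `MSG_BUF_SIZE` and `store_message` are hypothetical names, and the actual network read is left out so that only the length clamp is shown.

```c
#include <stddef.h>
#include <string.h>

#define MSG_BUF_SIZE 64  /* hypothetical size; pick what your protocol needs */

/* Copy at most MSG_BUF_SIZE - 1 bytes of an incoming message into buf
 * and NUL-terminate it. Returns the number of bytes actually stored,
 * so the caller can detect truncation by comparing the result with len. */
size_t store_message(char buf[MSG_BUF_SIZE], const char *data, size_t len)
{
    size_t n = len < MSG_BUF_SIZE - 1 ? len : MSG_BUF_SIZE - 1;
    memcpy(buf, data, n);
    buf[n] = '\0';
    return n;
}
```

If the returned count is smaller than the length that arrived, the message was truncated and the rest should be dropped or read into a second buffer rather than written past the end of the first.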

I tried making the net thread call another function to do what was
making it seg fault but that has not fixed it.

Is there any “scope” of these threads I should know about?

Only that you have to make sure that two threads don’t use the same
variables at the same time. For example, if two threads try to assign
the same variable at the same time the results are anyone’s guess.

Do I need to run the rest of my program as a thread also, or can I
just have the 1 thread for network while the normal game loop
continues?

The rest of your program is already running in another thread. You start
out with one thread, the main thread, and then start others.

I’m really baffled as to why I can do things like printf but not a
simple variable++!!!

Yeah that can be pretty painful. Try stubbing out routines until the
program works and then put the code back in until it fails. That will
tell you where the problem is.

Thanks heaps in advance,
Tim Hanlon

Bob Pendleton

SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl
--
+-----------------------------------+
+ Bob Pendleton: independent writer +
+ and programmer.                   +
+ email: Bob at Pendleton.com       +
+ web: www.GameProgrammer.com       +
+-----------------------------------+

Loren_Osborn · October 1, 2003, 2:04pm

— Bob Pendleton wrote:

Hi,

I’m using a thread in my program for network stuff while the main loop
of my game continues. Everything works fine, and I can get a message
from the server and printf it, but if I try to do anything else, like
just incrementing a variable for the amount of players I get a seg
fault (Fatal signal: Segmentation Fault (SDL Parachute Deployed)).

This sounds a lot like memory corruption. Is your network code writing
an arbitrary amount of information to a fixed size buffer?

This is, of course, possible, but I suspect something else.

I tried making the net thread call another function to do what was
making it seg fault but that has not fixed it.

Is there any “scope” of these threads I should know about?

Only that you have to make sure that two threads don’t use the same
variables at the same time. For example, if two threads try to assign
the same variable at the same time the results are anyone’s guess.

True, but the mechanism usually used for doing this is called a
“mutex” (short for mutual exclusion). You don’t store any DATA in a
mutex, you simply use it to broker authority to write to some OTHER
variable or data structure. So, if you’re going to increment your # of
players from your network thread, the network thread would:
“lock” the mutex prior to incrementing,
do the increment, then
“unlock” the mutex.

Then before reading or modifying the value, the main thread would:
“lock” the mutex prior to reading,
use the variable,
but not “unlock” the mutex until the value is safe to modify again.

If one thread tries to lock a mutex that’s already locked, the call
will “block” that thread until it’s safe for it to acquire the lock.
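
The lock / modify / unlock pattern described above can be sketched roughly as follows. This is a minimal illustration rather than code from the thread: it uses POSIX threads so that it builds without SDL, and `players_lock`, `add_player`, `get_num_players` and `run_demo` are made-up names. In SDL 1.2 the corresponding calls would be SDL_CreateMutex(), SDL_mutexP() to lock and SDL_mutexV() to unlock.

```c
#include <pthread.h>

static pthread_mutex_t players_lock = PTHREAD_MUTEX_INITIALIZER;
static int num_players = 0;   /* shared; only touch while holding players_lock */

/* Network thread side: lock, modify, unlock. */
void add_player(void)
{
    pthread_mutex_lock(&players_lock);
    num_players++;
    pthread_mutex_unlock(&players_lock);
}

/* Main thread side: copy the value under the lock, then use the copy. */
int get_num_players(void)
{
    int n;
    pthread_mutex_lock(&players_lock);
    n = num_players;
    pthread_mutex_unlock(&players_lock);
    return n;
}

/* Stand-in for a network thread: pretends that `joins` players connect. */
static void *net_thread_main(void *arg)
{
    int joins = *(int *)arg;
    for (int i = 0; i < joins; i++)
        add_player();
    return NULL;
}

/* Run two such threads at once and return the final player count. */
int run_demo(int joins)
{
    pthread_t a, b;
    pthread_create(&a, NULL, net_thread_main, &joins);
    pthread_create(&b, NULL, net_thread_main, &joins);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return get_num_players();
}
```

Because every access goes through the same mutex, two threads that each add 100000 players should always end up at 200000. Take the lock calls out and the total will usually come up short.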

Try hard to not assume that operations are atomic. Your simple
variable++ example involves at least:
read variable
increment variable
write variable

What happens if the main thread tries to read the variable part-way
through the write operation? Also I suggest you consider making any
variables accessed by multiple threads “volatile”. That should ensure
they are flushed out to memory after every operation.
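
As a sketch of the read / increment / write point: where a C11 compiler is available, the three steps can be made one indivisible operation with `<stdatomic.h>`. That header is newer than this discussion, so treat this as an aside, not as something SDL provided at the time. The names `hits`, `worker` and `count_hits` are hypothetical, and POSIX threads are used only to keep the example buildable without SDL. Note that marking a variable volatile does not by itself make variable++ indivisible, so shared counters still need either a mutex or an atomic type.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int hits;   /* static storage, so it starts at zero */

static void *worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        atomic_fetch_add(&hits, 1);   /* one indivisible read-modify-write */
    return NULL;
}

/* Start four threads that each add per_thread to the shared counter,
 * wait for all of them, and return the final value. */
int count_hits(int per_thread)
{
    pthread_t t[4];

    atomic_store(&hits, 0);
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, &per_thread);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&hits);
}
```

An atomic counter like this only covers a single variable. As soon as two or more values have to change together, a mutex is still the right tool.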

I’m really baffled as to why I can do things like printf but not a
simple variable++!!!

Yeah that can be pretty painful. Try stubbing out routines until the
program works and then put the code back in until it fails. That will
tell you where the problem is.

While this is a quite reasonable suggestion with single-threaded code,
it is quite foolhardy for multi-threaded code. Multi-threaded code is
very sensitive to timing issues, and, if not written properly, will
often work fine on one machine, but not on a seemingly identical CPU.
CPU lot number, or even differences in cooling, can add enough variation
to make poorly written multi-threaded code misbehave.

Be sure to be vigilant about using mutexes, even where
it seems a bit trivial. (Never assume an operation is
atomic… ESPECIALLY any sort of WRITE operation.)

Hope this helps,

-Loren

Timothy_Hanlon · October 1, 2003, 4:59pm

Hi again,

I’ve added an SDL_mutex *mutex; globally, and in my init function I use mutex = SDL_CreateMutex();

Now in my network thread when I do a SDL_mutexP(mutex); I get the same seg fault, without even trying to increment the variable.

None of the examples I’ve looked at seem to assign a variable to the mutex in any way - they just create it and then use it - is there something I’ve overlooked? Do I need to pass the mutex to the network thread? How do I do this when I’m already using the one variable I can pass in to the thread for something else?

Thanks again,
Tim

Loren_Osborn · October 1, 2003, 9:37pm

You want to make sure that the mutex is, of course, created before you
start your network thread… Besides that, I’d really have to see the
code to give any further suggestions.

-Loren

Bob_Pendleton · October 2, 2003, 1:16pm

— Bob Pendleton <@Bob_Pendleton> wrote:

I’m really baffled as to why I can do things like printf but not a
simple variable++!!!

Yeah that can be pretty painful. Try stubbing out routines until the
program works and then put the code back in until it fails. That will
tell you where the problem is.

While this is a quite reasonable suggestion with single-threaded code,
it is quite foolhardy for multi-threaded code.

Interesting observation. As you might know I recently wrote a threaded
IO package for SDL (http://gameprogrammer.com/net2/net2-0.html) so I
have direct knowledge of the challenges being faced by the person who
asked the original question. So, my answers are based on solving the
same problem with the same tools.

I would like to hear your suggestions on how to debug multi-threaded
code. The foolhardy technique I suggested has been working for me for
nearly 30 years. I’ve used it many times since I wrote my first
multi-threaded code in FORTRAN on a UNIVAC 1108 (an ancient language on
a forgotten multi-processor system). So, I would really like the benefit
of your experience and wisdom in helping me learn better methods for
solving these problems.

Multi-threaded code is very sensitive to timing issues, and, if not
written properly, will often work fine on one machine, but not on a
seemingly identical CPU. CPU lot number, or even differences in cooling,
can add enough variation to make poorly written multi-threaded code
misbehave.

Be sure to be vigilant about using mutexes, even where it seems a bit
trivial. (Never assume an operation is atomic… ESPECIALLY any sort of
WRITE operation.)

Except, of course, that you can and must assume that mutex operations,
and the underlying machine instructions that they are based on are
atomic.
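
To illustrate how a lock can rest on a single atomic machine operation, here is a teaching sketch of a spinlock built on the C11 `atomic_flag` test-and-set. It is not how SDL or any particular operating system implements its mutexes (real ones put waiting threads to sleep instead of spinning), and `spin_lock`, `spin_unlock`, `adder` and `run_spin_demo` are hypothetical names. POSIX threads are used so that it builds without SDL.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag spin = ATOMIC_FLAG_INIT;
static long total = 0;   /* protected by spin */

static void spin_lock(void)
{
    /* test_and_set atomically sets the flag and returns its old value.
     * Loop until we are the thread that changed it from clear to set. */
    while (atomic_flag_test_and_set(&spin))
        ;   /* busy-wait */
}

static void spin_unlock(void)
{
    atomic_flag_clear(&spin);
}

static void *adder(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        spin_lock();
        total++;            /* ordinary, non-atomic increment, made safe by the lock */
        spin_unlock();
    }
    return NULL;
}

/* Two threads each add per_thread to total; returns the final value. */
long run_spin_demo(int per_thread)
{
    pthread_t a, b;

    total = 0;
    pthread_create(&a, NULL, adder, &per_thread);
    pthread_create(&b, NULL, adder, &per_thread);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return total;
}
```

The only operation here that has to be atomic is the test-and-set. Everything between lock and unlock can be ordinary code.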

Hope this helps,

-Loren

Bob Pendleton

Bob_Pendleton · October 2, 2003, 1:27pm

Hi again,

I’ve added an SDL_mutex *mutex; globally, and in my init function I
use mutex = SDL_CreateMutex();

You have to make sure that this happens before you start the thread and
you need to make sure that the mutex is not null. In some error
conditions you might not be able to allocate a mutex.

Now in my network thread when I do a SDL_mutexP(mutex); I get the same
seg fault, without even trying to increment the variable.

The mutex is either null or you have corrupted the memory allocated to
the mutex.
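
The ordering described above (create the mutex, check it, and only then start the thread) might look roughly like this. It is a sketch under stated assumptions: POSIX threads stand in for SDL so that it builds on its own, and `create_mutex` is a hypothetical helper written to behave like SDL_CreateMutex() in one respect only, namely returning NULL on failure. `start_and_wait` and `messages` are also made-up names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical helper in the spirit of SDL_CreateMutex(): NULL on failure. */
pthread_mutex_t *create_mutex(void)
{
    pthread_mutex_t *m = malloc(sizeof *m);
    if (m != NULL && pthread_mutex_init(m, NULL) != 0) {
        free(m);
        m = NULL;
    }
    return m;
}

static pthread_mutex_t *lock_mutex = NULL;   /* global, visible to both threads */
static int messages = 0;                     /* protected by lock_mutex */

static void *net_thread_main(void *unused)
{
    (void)unused;
    pthread_mutex_lock(lock_mutex);    /* safe: the mutex exists before we start */
    messages++;
    pthread_mutex_unlock(lock_mutex);
    return NULL;
}

/* Returns the message count on success, -1 if setup failed. */
int start_and_wait(void)
{
    pthread_t net_thread;
    int result;

    lock_mutex = create_mutex();             /* 1. create the mutex first */
    if (lock_mutex == NULL)                  /* 2. make sure we really got one */
        return -1;

    /* 3. only now start the thread that will lock it */
    if (pthread_create(&net_thread, NULL, net_thread_main, NULL) != 0) {
        pthread_mutex_destroy(lock_mutex);
        free(lock_mutex);
        lock_mutex = NULL;
        return -1;
    }

    pthread_join(net_thread, NULL);          /* thread has finished, safe to read */
    result = messages;

    pthread_mutex_destroy(lock_mutex);
    free(lock_mutex);
    lock_mutex = NULL;
    return result;
}
```

If the thread were started before step 1, it could try to lock a NULL pointer, which is one way to get exactly the kind of crash described in this thread.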

None of the examples I’ve looked at seem to assign a variable to the
mutex in any way - they just create it and then use it - is there
something I’ve overlooked?

No that is how it works.

Do I need to pass the mutex to the network thread? How do I do this
when I’m already using the one variable I can pass in to the thread
for something else?

No, both threads have access to all the same global variables; the IO
thread and the main thread share them. Local variables in a function
are local to the thread that calls it.
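
On the "one variable I can pass in" question: if you would rather not rely on globals, a common approach is to bundle everything the thread needs into a struct and pass a pointer to that. The sketch below assumes POSIX threads so that it builds without SDL (the single `void *` argument works the same way with SDL_CreateThread), and `struct net_args`, its fields and `run_with_args` are hypothetical names.

```c
#include <pthread.h>

/* Hypothetical bundle: everything the network thread needs, behind one pointer. */
struct net_args {
    pthread_mutex_t *lock;    /* protects *num_players */
    int *num_players;
    int port;                 /* stand-in for whatever was already being passed */
};

static void *net_thread_main(void *data)
{
    struct net_args *args = data;
    int joined = 1;           /* local variable: private to this thread */

    pthread_mutex_lock(args->lock);
    *args->num_players += joined;
    pthread_mutex_unlock(args->lock);
    return NULL;
}

/* Returns the player count after the thread has run, or -1 on failure. */
int run_with_args(void)
{
    pthread_mutex_t lock;
    pthread_t t;
    int num_players = 0;
    struct net_args args = { &lock, &num_players, 4000 };

    if (pthread_mutex_init(&lock, NULL) != 0)
        return -1;
    if (pthread_create(&t, NULL, net_thread_main, &args) != 0) {
        pthread_mutex_destroy(&lock);
        return -1;
    }
    pthread_join(t, NULL);    /* args must outlive the thread that uses it */
    pthread_mutex_destroy(&lock);
    return num_players;
}
```

The struct has to stay alive for as long as the thread uses it, so in a real game it would normally be a global or heap-allocated, not a local in a function that returns early.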

One thing to worry about: SDL_net is not thread safe. If both threads
use SDL_net you can get some very weird bugs. OTOH, the standard IO
libraries are usually thread safe (depends on the OS). That leads to the
odd situation where code will work when you put in trace code but stop
working when you remove it.

I wrote a library to do the same thing you are doing a little while ago.
Feel free to look at it and see how I solved the same problems. It is at
http://gameprogrammer.com/net2/net2-0.html

One thing I have found to be very helpful in building threaded code is
to build it a little bit at a time. Build a piece of dummy code for
the thread that just generates fake results. When that works then you
can start adding in real code. If you develop and test in small hunks
you will know what changes revealed a bug. It may be a bug that was
already there, but you know what revealed it even if you don’t know what
caused it.
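
As a rough idea of what such a dummy thread could look like: the sketch below has no sockets at all, only a stub "network" thread that produces predictable fake player IDs into a small mutex-protected queue. It is an illustration only. POSIX threads are used so that it builds without SDL, and `fake_net_thread`, `queue`, `FAKE_EVENTS` and `run_stub` are hypothetical names.

```c
#include <pthread.h>

#define FAKE_EVENTS 5

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int queue[FAKE_EVENTS];   /* protected by queue_lock */
static int queue_len = 0;        /* protected by queue_lock */

/* Stub "network" thread: no sockets, just fake player IDs 1..FAKE_EVENTS. */
static void *fake_net_thread(void *unused)
{
    (void)unused;
    for (int id = 1; id <= FAKE_EVENTS; id++) {
        pthread_mutex_lock(&queue_lock);
        queue[queue_len++] = id;
        pthread_mutex_unlock(&queue_lock);
    }
    return NULL;
}

/* Start the stub, wait for it, and return the sum of the IDs it produced
 * (or -1 if the thread could not be started). */
int run_stub(void)
{
    pthread_t t;
    int sum = 0;

    queue_len = 0;   /* no other thread is running yet, so no lock needed */
    if (pthread_create(&t, NULL, fake_net_thread, NULL) != 0)
        return -1;
    pthread_join(t, NULL);

    pthread_mutex_lock(&queue_lock);
    for (int i = 0; i < queue_len; i++)
        sum += queue[i];
    pthread_mutex_unlock(&queue_lock);
    return sum;
}
```

Once the main loop handles these fake events correctly, the body of the stub can be replaced with real network code one piece at a time, while the locking and the queue stay exactly as they were tested.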

Bob Pendleton

David_Olofson · October 2, 2003, 4:22pm

[…]

One thing I have found to be very helpful in building threaded code
is to build it a little bit at a time. Build a piece of dummy
code for the thread that just generates fake results. When that
works then you can start adding in real code. If you develop and
test in small hunks you will know what changes revealed a bug. It
may be a bug that was already there, but you know what revealed it
even if you don’t know what caused it.

Amen! In fact, I’d recommend that approach for all sorts of code. The
related “test modules outside the system before throwing them in”
strategy is also a very good one.

Problem is that if you do this with games, there’s a big risk you’ll
end up “testing” more than you code, once you come to the “sort of
playable” stage…

//David Olofson - Programmer, Composer, Open Source Advocate

.- Audiality -----------------------------------------------.
|  Free/Open Source audio engine for games and multimedia.  |
| MIDI, modular synthesis, real time effects, scripting,... |
`-----------------------------------> http://audiality.org -'
--- http://olofson.net --- http://www.reologica.se ---

Timothy_Hanlon · October 2, 2003, 7:41pm

Hi again,

I’ve now tried everything…declaring the mutex globally, creating it before the thread is created, and now passing in the mutex to the network thread when I create it.

I now have (global)

SDL_mutex *lock_mutex;
SDL_Thread *net_thread=NULL;

then in my main()

lock_mutex = SDL_CreateMutex();
net_thread = SDL_CreateThread(net_thread_main,lock_mutex);

then in net_thread_main(void *data)

SDL_mutex *mutex = (SDL_mutex *) data;

and

Clemens_Kirchgattere · October 2, 2003, 10:16pm

“Timothy Hanlon” wrote:

SDL_mutex *lock_mutex;
SDL_Thread *net_thread=NULL;

then in my main()

lock_mutex = SDL_CreateMutex();
net_thread = SDL_CreateThread(net_thread_main,lock_mutex);

then in net_thread_main(void *data)

SDL_mutex *mutex = (SDL_mutex *) data;

why do you pass the mutex to the thread? as you said the mutex is GLOBAL
so the thread has access to it anyway. though, that is not the problem
of the segfault, just odd.

clemens

Loren_Osborn · October 3, 2003, 8:34pm

— Bob Pendleton wrote:

I’m really baffled as to why I can do things like printf but not a
simple variable++!!!

Yeah that can be pretty painful. Try stubbing out routines until the
program works and then put the code back in until it fails. That will
tell you where the problem is.

While this is a quite reasonable suggestion with single-threaded code,
it is quite foolhardy for multi-threaded code.

Interesting observation. As you might know I recently wrote a threaded
IO package for SDL (http://gameprogrammer.com/net2/net2-0.html) so I
have direct knowledge of the challenges being faced by the person who
asked the original question. So, my answers are based on solving the
same problem with the same tools.

I would like to hear your suggestions on how to debug multi-threaded
code. The foolhardy technique I suggested has been working for me for
nearly 30 years. I’ve used it many times since I wrote my first
multi-threaded code in FORTRAN on a UNIVAC 1108 (an ancient language on
a forgotten multi-processor system). So, I would really like the benefit
of your experience and wisdom in helping me learn better methods for
solving these problems.

Well, I believe this is primarily a misunderstanding on my part.
There exists a (hopefully) small populace of programmers (or at least
that’s what they call themselves) that seem to put very little thought
into what they’re doing, or how anything works, and program almost
purely by trial and error. Despite all our years of training on how a
system should be designed first, then implemented, and our conditioning
that doing otherwise cannot possibly work, these enigmas do exist. I
have seen them work. I suspect these people were the examples cited
when they proposed the “Million monkeys with a million typewriters for
a million years would produce the complete works of Shakespeare”
hypothesis. I think your suggestion of “stubbing out routines until the
program works and then …” sounded a bit too reminiscent of this
mentality, and I must have misinterpreted it as such. My intent was not
that a code-test-code-test cycle was a bad idea, but rather that without
a solid understanding of what data is being shared by the threads, why
this data needs to be mutex protected, and how to protect it, such
a cycle is insufficient (even in the miraculous circumstance that you do
get something working without any forethought) to be confident that it
will work on any machine besides your own.

I remember reading an appropriate cartoon caption on a wall in college:
“You start coding. I’ll run upstairs and see what they want.”

Multi-threaded code is very sensitive to timing issues, and, if not
written properly, will often work fine on one machine, but not on a
seemingly identical CPU. CPU lot number, or even differences in cooling,
can add enough variation to make poorly written multi-threaded code
misbehave.

Be sure to be vigilant about using mutexes, even where it seems a bit
trivial. (Never assume an operation is atomic… ESPECIALLY any sort of
WRITE operation.)

Except, of course, that you can and must assume that mutex operations,
and the underlying machine instructions that they are based on are
atomic.

Quite true, except that since this is supposed to be the DEFINITION
of a mutex, I didn’t feel this was an assumption.

My apologies for any misunderstandings.

Best regards,

-Loren
