[Theory] Is the closure of Stats Server/Ranked Mode the cause of Linux client crashes?


(REA987) #1

Hey,

As a few of you might be aware, the Linux client of ETQW suffers from weird crash/freeze/screen lock/segfault issues. As the game is a decade old and there hasn’t been an update for 9 years, it has a tendency to crash on modern versions of Windows, Mac and Linux anyway. But the Linux crash that I will try to explain is a bit different.

In 2012, the Linux version started to suffer from a missing characters issue, which is caused by a glibc update released around that time. It is also claimed that the libgcc library shipped with the game (libgcc_s.so.1) causes crashes, which can be prevented by removing/renaming that library.

http://forums.warchest.com/showthread.php/32089-ETQW-oddities-with-glibc-2-15-FIX
http://forums.warchest.com/showthread.php/32089-ETQW-oddities-with-glibc-2-15-FIX?p=396658&viewfull=1#post396658
https://bugs.archlinux.org/task/28093
https://bbs.archlinux.org/viewtopic.php?id=133922

Since you_is_me recently compiled an up-to-date libc with the arguments referred to in those threads, the missing letter issue can be fixed, but the segfault still happens in some cases.

http://forums.warchest.com/showthread.php/32089-ETQW-oddities-with-glibc-2-15-FIX?p=570188&viewfull=1#post570188
http://forums.warchest.com/showthread.php/32089-ETQW-oddities-with-glibc-2-15-FIX?p=570209&viewfull=1#post570209

I believe the libgcc-related crashes, which were connected with the libc release at the time, aren’t the exact cause of the crashes that Linux clients suffer nowadays. I recently tried several Ubuntu versions from before 2012 (11.10, 10.04.4, 10.04, 9.04, 8.04) on an old laptop with Nvidia proprietary graphics from 2008, and the segfault is still there.

Here are the symptoms that I observed and the theory that I came up with:

  • The game doesn’t crash on Linux when you connect to a server immediately after launching the game.
  • The game doesn’t crash on Linux when you keep connecting and disconnecting for testing purposes.
  • The game does crash/freeze on Linux if you wait in the server menu for some time (5 minutes or more), then try to connect to a server.
  • The game does crash/freeze on Linux while connecting to another server after playing a long game.

Now, knowing that the game crashes/freezes today even on distros from 2008, I believe this is a server-side problem. I theorize that the closure of the stats server and/or ranked mode causes the issue. The game evidently attempts to do something if you wait in the server menu for a long time. Considering that the stats server and ranked mode were shut down around 2010, that might provide some clues. I believe the game tries to connect to the stats server in order to retrieve up-to-date data; once those attempts have failed because there is no stats server anymore, it crashes/freezes/segfaults when the player connects to a game server. It doesn’t happen if the player connects to a server immediately, since at that point the game is still retrying to gather stats.

I haven’t tested that theory; if someone can point me to an easy-to-use tool (with a GUI) that allows tracking the connection activity of a certain program on Linux, I may be able to prove or disprove it. The biggest problem with the theory is the fact that the Windows and Mac versions do not crash from this issue. I would argue that those versions were developed/ported by Splash Damage and Aspyr, so they considered such possibilities during development, whereas the Linux version was unofficially ported by a former id Software employee, Timothee Besset, who might not have been aware of such a possibility.

So, here it is. After dealing with that obnoxious “freeze during connection” issue for a year, that’s the best explanation that I came up with. What do you think and how can it be prevented?


(edxot) #2

I would try running the game on a single core (disabling the multi-core support - there’s a command for that).
Synchronization across multiple cores is not as simple as with multiple threads.


(you_is_me) #3

[QUOTE=edxot;573326]I would try running the game on a single core (disabling the multi-core support - there’s a command for that).
Synchronization across multiple cores is not as simple as with multiple threads.[/QUOTE]

I also thought it could be a race condition problem (singleton done wrong).
Using r_useThreadedRenderer 1 with the etqw.x86 executable (not the etqw-rthread.x86 one) and adding maxcpus=1 to the kernel command line (so no other CPU ever comes online) does not help. It still happens at random.

By the way: PunkBuster is not the problem. I completely removed it (the game doesn’t even show the PunkBuster checkbox in the top left corner of the server browser) and I still get the broken_pipe.

I’ve tried to trace the problem (and also to find out what ETQW dlopens) and it may (?) be related to some .so files. Here is a short excerpt of the output:

open("/usr/lib32/libGLX_indirect.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libexpat.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-dri3.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-xfixes.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-present.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-sync.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxshmfence.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libglapi.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libXdamage.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libXfixes.so.3", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libX11-xcb.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-glx.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libxcb-dri2.so.0", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libXxf86vm.so.1", O_RDONLY|O_CLOEXEC) = 28
open("/usr/lib32/libdrm.so.2", O_RDONLY|O_CLOEXEC) = 28
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
open("/home/private/.etqwcl/sdnet/you_is_me/password.dat", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26
open("/home/private/.etqwcl/sdnet/you_is_me/user.dat", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26
open("/home/private/.etqwcl/sdnet/you_is_me/base.dict", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26
open("/home/private/.etqwcl/sdnet/you_is_me/base/profile.cfg", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26
open("/home/private/.etqwcl/sdnet/you_is_me/base/bindings.cfg", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26
+++ exited with 0 +++

So maybe libdrm.so.2 is the problem?

It could be. libdrm works with Mesa (quote: “LIBDRM is the cross-driver middleware which allows user-space applications (such as Mesa and 2D drivers) to communicate with the Kernel by the means of the DRI protocol.”), and we know that Mesa already had problems with QW.
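
The reading of the excerpt above (the last library opened before the SIGSEGV line) can be automated for longer logs. The following is only a minimal sketch: the function name and the sample list are hypothetical, and it assumes the log has the same line format as the excerpt. Note that a trace filtered to open calls hides every other system call, and with -f the lines of different threads are interleaved, so the last file opened before the signal is a hint at best, not proof of the cause.

```python
import re

# Matches strace lines such as:
#   open("/usr/lib32/libdrm.so.2", O_RDONLY|O_CLOEXEC) = 28
# and also the openat(AT_FDCWD, "...") form printed on newer systems.
OPEN_RE = re.compile(r'open(?:at)?\(.*?"([^"]+)"')


def last_open_before_signal(lines, signal="SIGSEGV"):
    """Return the path of the last open()/openat() logged before the signal line.

    Returns None if the signal never appears or no open call precedes it.
    """
    last_path = None
    for line in lines:
        if signal in line:
            return last_path
        match = OPEN_RE.search(line)
        if match:
            last_path = match.group(1)
    return None


# Hypothetical sample, shortened from the excerpt posted above.
sample = [
    'open("/usr/lib32/libXxf86vm.so.1", O_RDONLY|O_CLOEXEC) = 28',
    'open("/usr/lib32/libdrm.so.2", O_RDONLY|O_CLOEXEC) = 28',
    '--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---',
    'open("/home/private/.etqwcl/sdnet/you_is_me/password.dat", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 26',
]
print(last_open_before_signal(sample))
```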


(REA987) #4

[QUOTE=you_is_me;573394]So maybe libdrm.so.2 is the problem?

It could be. libdrm works with Mesa (quote: “LIBDRM is the cross-driver middleware which allows user-space applications (such as Mesa and 2D drivers) to communicate with the Kernel by the means of the DRI protocol.”), and we know that Mesa already had problems with QW.[/QUOTE]

Well, that’s new. But the game crashes with both the proprietary Nvidia drivers and the open source (Mesa) Intel drivers. Also, as the game crashes even on Linux distros from 2008, it cannot be a client problem alone. There has to be a server-related part of the issue. Can you please post the command so I can test it on my Nvidia system?


(you_is_me) #5

If it still crashes on your old system (without updating anything on it), then I think there isn’t a lot that can be done.
I’ve used “strace -f -e open /opt/etqw/etqw-rthread.x86” for this (I recommend redirecting the output into a text file for easier reading, though).
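
For anyone who wants to repeat the trace and keep the output in a file, here is a rough sketch that only assembles the command line. The -f, -e trace= and -o options are standard strace options; the helper name, the log file name etqw_trace.txt, and the extra openat and connect system calls are assumptions added for illustration (openat because newer C libraries use it in place of open, connect because the thread starter wants to see connection attempts).

```python
import shlex


def build_strace_command(binary, log_path, syscalls=("open", "openat", "connect")):
    """Build an strace argument list that follows threads and logs to a file."""
    return [
        "strace",
        "-f",                                 # follow threads and child processes
        "-e", "trace=" + ",".join(syscalls),  # log only the listed system calls
        "-o", log_path,                       # write the trace to a file instead of stderr
        binary,
    ]


# The binary path is the one from the post above; the log name is made up.
cmd = build_strace_command("/opt/etqw/etqw-rthread.x86", "etqw_trace.txt")
print(shlex.join(cmd))  # prints the command as it would be typed in a shell
```

The list can be passed to subprocess.run(cmd) as-is, or the printed line can be pasted into a terminal.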


(edxot) #6

I think he is asking about this command:

[Dual/Multi-Core Tweaks]

r_useThreadedRenderer [0,1,2 ] - Added as of the 1.2 Patch, this option allows you to enable multithreading if you have a dual or multi-core CPU, and this can improve the performance of ET:QW. By default it’s disabled (set to 0), but you can set it to either 1 or 2, with a value of 1 locking the renderer to your in-game frames, while 2 allows it to run unlocked. The developers recommend a value of 2 for this variable if you wish to enable it. Note that you must enable it either by inserting it in your autoexec.cfg file, or by entering it in the console prior to the start of a game; it can’t be changed during a game.

NOTE: taken from this page: http://www.tweakguides.com/ETQW_9.html


I would also try to use this command because unhandled exceptions usually cause programs to end.

winExceptionHandler
1 enables the built-in exception handling, 0 disables it.


BTW, I am completely guessing here; I have never used Linux with ETQW.


(edxot) #7

I could have added this to the previous post, but it is somewhat unrelated, so I decided to make a new one.

Short story:
More than 20 years ago, I was teaching short beginner courses about computers at an institution. As such, I had to maintain a fleet of classroom computers so students could use them. In 1995 there was no internet as there is today, so my main problem was viruses all over the place.
Viruses, as they existed back then, were pieces of code that attached themselves to executable files or to drives’ boot sectors. Nobody would believe how many different species existed unless they read the documentation of one of the 3 different anti-virus programs we used there.
Back then, I saw even weirder things, like viruses that would infect the BIOS and simulate the boot process from floppy (if ordered to do so) while starting the virus from the hard disk first.

For most here, this is very old news (just boring), but there is a reason for it. Does anyone believe that this technology has vanished?
Really???

I never heard of any anti-virus for Linux. So, just because their computers work fine (without prompting some message saying “stoned” or something else) most people believe they don’t need one.

Anyway, I’m not trying to start a Windows/Linux discussion here, but it really puzzles me why people decide not to trust Microsoft (who have a lot to lose) and instead trust a lot of other companies and developers that most of the time have nothing to lose.

Why use ETQW with Linux? It makes no sense at all.


(REA987) #8

Dear edxot,

We beg you; get a life.

Regards,
ETQW community


(edxot) #9

[QUOTE=REA987;573453]Dear edxot,

We beg you; get a life.

Regards,
ETQW community[/QUOTE]

Is that what people call you? Or what you call yourself?

Off-Topic:

The war on noobs is over, as stated in diverse news networks: “edxot is about to be defeated”. Meanwhile our reporters on the field intercepted this communication:

  • “Sir edxot SIR, we are completely surrounded”
  • “Excellent, it means we can attack in any direction”
    --------> Don’t forget to subscribe for more fake news.

(you_is_me) #10

[QUOTE=edxot;573326]I would try running the game on a single core (disabling the multi-core support - there’s a command for that).
Synchronization across multiple cores is not as simple as with multiple threads.[/QUOTE]

So I tried it as far as it gets:
ETQW Config:
r_usesmp 0
r_usethreadedrenderer

Then I added isolcpus=1 to my kernel command line, which means CPU1 will not be considered for scheduling at all. A process only runs on this CPU when I say so (so there is no competition between processes for CPU time).
I used iptables so that only QW has access to the Internet.
I started ETQW with the highest priorities possible: niceness -20, ionice class RT (realtime), FIFO scheduler with the highest priority, and core affinity 0x2 (CPU1).

And it still crashes.
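
A side note on the affinity value: 0x2 is a bitmask in which bit n stands for CPU n, which is why it selects CPU1, the same core that isolcpus=1 takes out of normal scheduling. A small sketch of that decoding (the function name is made up for illustration):

```python
def mask_to_cpus(mask):
    """Return the CPU indices selected by an affinity bitmask (bit n = CPU n)."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


print(mask_to_cpus(0x2))  # [1]     only CPU1, as used in the test above
print(mask_to_cpus(0x3))  # [0, 1]  CPU0 and CPU1
```

On Linux, the same set of CPU indices is what Python's os.sched_setaffinity expects, should someone want to pin a launcher script to one core.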

@REA987:
Did you try to capture packets with a program like Wireshark?


(REA987) #11

[QUOTE=you_is_me;573483]
@REA987:
Did you try to capture packets with a program like Wireshark?[/QUOTE]

Will try at the weekend. Also, the libc that you recompiled would help in making an AppImage of the game that’ll be immune to the missing letter issue.


(edxot) #12

Just want to share a video before people start calling me fanboy


(REA987) #13

[QUOTE=you_is_me;573483]So I tried it as far as it gets:
ETQW Config:
r_usesmp 0
r_usethreadedrenderer[/QUOTE]

As far as I know,

r_useSMP

works for Quake 4 and Doom 3, not for ETQW which uses

r_useThreadedRenderer

instead. Am I wrong?