FPS leak while capping frame rate

Hi, I’m new here.
Every time I set an FPS limit, the measured FPS comes out lower than it should be. The higher the limit, the more FPS I lose.
Without a cap my program averages 9300 FPS.

For example:
FPS_LIMIT = 150 gives about 128 stable FPS.
FPS_LIMIT = 500 gives about 345 FPS.
FPS_LIMIT = 2001 and above is bugged, but that’s not the point of this topic.

class properties:

Uint16 FPS_LIMIT = 150;
Uint16 iFrames = 0;
Uint32 iFrameLength = 0;
Uint32 frameStartTime = 0;

main loop:

void loop()
{
	SDL_Event wndEvent;

	while(true)
	{
		frameStartTime = SDL_GetTicks();

		if(SDL_PollEvent(&wndEvent) && wndEvent.type == SDL_QUIT)
			break;
			
		glClearColor(0, 0, 0, 0);
		glClear(GL_COLOR_BUFFER_BIT);
			
		SDL_GL_SwapWindow(wnd);

		processFps(true, true);
	}
}

function that shows and caps FPS:

void processFps(bool showFps = true, bool fpsLimit = false)
{
	if(fpsLimit)
	{
		Uint16 expectedFrameLength = round(1000.0 / FPS_LIMIT);
		Uint32 lastFrameLength = SDL_GetTicks() - frameStartTime;
			
		if(expectedFrameLength > lastFrameLength)
			SDL_Delay(expectedFrameLength - lastFrameLength);
	}

	if(showFps)
	{
		iFrames++;
		iFrameLength += SDL_GetTicks() - frameStartTime;

		if(iFrameLength >= 1000.0) // if the frames reached 1 second of rendering time
		{
			cout << iFrames << " fps" << endl;

			iFrames = 0;
			iFrameLength = 0;
		}
	}
}

I really don’t understand what is eating my FPS. I tested a different FPS calculation algorithm and the same problem occurred. The goal is to get the same FPS as set in the FPS_LIMIT constant.

Thanks for any help!

Just quickly looking at your code I think you should do:

iFrameLength -= 1000;

and not

iFrameLength = 0;

But I think what’s really going wrong is that you’re doing a delay of a whole 2 ms in SDL_Delay when you have FPS_LIMIT at 500 (its argument is an unsigned integer number of milliseconds). Your drawing takes a fraction of a millisecond, but then you do an SDL_Delay of a full 2 ms after it, rather than a delay of, say, 1.9 ms.
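
To make the quantization concrete, here is a small standalone sketch (my illustration, not code from this thread) that prints the best-case FPS you can reach when the per-frame delay is rounded to whole milliseconds; any wake-up latency on top pushes the real number lower still.

#include <cmath>
#include <cstdio>

// Best case when SDL_Delay() can only wait whole milliseconds: the frame
// length becomes round(1000.0 / limit) ms, so a limit of 150 yields 7 ms
// frames (~142 FPS) even before any wake-up latency is added.
int main()
{
	for (int limit : { 150, 500, 1000 }) {
		double ideal   = 1000.0 / limit;    // ideal frame length in ms
		double rounded = std::round(ideal); // what round() + SDL_Delay give
		std::printf("limit %4d: ideal %.2f ms, rounded %.0f ms -> at best ~%.0f FPS\n",
		            limit, ideal, rounded, 1000.0 / rounded);
	}
	return 0;
}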


I changed the frame time from Uint32 to GLdouble and created an algorithm that applies an extra 1 ms or 0 ms of delay based on how many fractional milliseconds (from 0 to 0.999…) remained from the previous frame.

If I need to wait 1.7 ms, I wait 1 ms and keep 0.7 ms to use in the next frame.
In the next frame I again need to wait 1.7 ms, plus the 0.7 ms from the previous frame, which gives a 2 ms delay with 0.4 ms carried to the next frame.
Then it’s 1.7 + 0.4, so SDL_Delay has to add an extra 2 ms, with 0.1 ms carried to the frame after that…

It’s not much better: I got 132 FPS at a limit of 150 and 345 FPS (the same as before) at a limit of 500.

important processFps() fragment:

if(fpsLimit)
{
	Uint32 lastFrameLength = SDL_GetTicks() - frameStartTime;
			
	if(EXPECTED_FRAME_LENGTH > lastFrameLength) // if last frame was rendered too fast, wait to make FPS limited
	{
		Uint8 waitableOffset = 0;
		GLdouble frameDelayOffsetOverdue = frameDelayOffset + EXPECTED_FRAME_LENGTH_OFFSET; // last remaining waiting time + current remaining time that SDL_Delay should include

		if(frameDelayOffsetOverdue >= 1) // if the remaining waiting time is 1ms or more, SDL_Delay can finally wait it
			waitableOffset = floor(frameDelayOffsetOverdue);

		frameDelayOffset = frameDelayOffsetOverdue - floor(frameDelayOffsetOverdue); // remaining waiting time that will be used in next frame by SDL_Delay if there will be at least 1ms (sum of offsets) to wait
				
		SDL_Delay((floor(EXPECTED_FRAME_LENGTH) - lastFrameLength) + waitableOffset); // floor() is important because offset of EXPECTED_FRAME_LENGTH was extracted and assigned to waitableOffset
	}
}

new properties:

GLdouble frameDelayOffset = 0;
static constexpr GLdouble EXPECTED_FRAME_LENGTH = 1000.0 / FPS_LIMIT; // length in milliseconds of a frame capped by FPS_LIMIT
static const GLdouble EXPECTED_FRAME_LENGTH_OFFSET = EXPECTED_FRAME_LENGTH - floor(EXPECTED_FRAME_LENGTH); // the fractional part in [0, 1) ms that SDL_Delay should also wait (but cannot, unless it is 0) to keep the FPS cap precise

I feel bad about this algorithm because I debugged it and it works as expected, yet the results are still very inaccurate.
It also seems like nobody has ever had this seemingly typical and common problem, or I just don’t know how to use Google and this forum.

I’m struggling to understand your code now, but how about trying this? It compares the total time that has actually elapsed against the total that should have elapsed by frame N, so per-frame rounding errors can’t accumulate. I’ve not compiled it, just typed it out here, so sorry if there’s anything wrong!

int iFrames = 0;
int iMillisecondsAtStart = SDL_GetTicks();
int iMillisecondsElapsedTotal = 0;
float fMillisecondsThatShouldHaveElapsed;
int iDifferenceInMilliseconds;

if (fpsLimit)
{
	// milliseconds that should have elapsed
	iFrames++;
	fMillisecondsThatShouldHaveElapsed = (float)iFrames * 1000.0f / (float)FPS_LIMIT;

	// milliseconds that have actually elapsed
	iMillisecondsElapsedTotal = SDL_GetTicks() - iMillisecondsAtStart;

	// difference
	iDifferenceInMilliseconds = (int)floorf(fMillisecondsThatShouldHaveElapsed) - iMillisecondsElapsedTotal;

	if (iDifferenceInMilliseconds > 0)
	{
		SDL_Delay(iDifferenceInMilliseconds);
	}
}

This is a very hard problem to solve by making a thread sleep.
Sleep functions only guarantee that the thread sleeps for at least the specified time; you are not guaranteed that it wakes up after exactly x ms. It takes time to re-queue your thread, and that is decided by the OS kernel.

I reckon the majority of people do one of two things:

  1. A while loop with no sleeping that checks the frame time (burns CPU cycles; see the sketch after this list).
  2. Enable vsync and sync with the monitor (not such a good solution for devices with different refresh rates).
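
For reference, here is a minimal busy-wait sketch of option 1 (my illustration, not code from this thread). With SDL and OpenGL, option 2 is just SDL_GL_SetSwapInterval(1) after creating the GL context.

#include <chrono>

// Option 1: spin until the frame deadline instead of sleeping.
// Precise to the clock's resolution, but it keeps one CPU core at 100%.
void busyWaitUntil(std::chrono::steady_clock::time_point deadline)
{
	while (std::chrono::steady_clock::now() < deadline) {
		// spin
	}
}

// usage per frame (for a 150 FPS cap):
// auto deadline = std::chrono::steady_clock::now() + std::chrono::nanoseconds(1000000000 / 150);
// render();
// busyWaitUntil(deadline);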

Here is an example I made which works OK-ish.
It’s C++11 and doesn’t use SDL, but maybe you can use the same idea.
The basic idea is to subtract the extra time spent oversleeping from the frame time of the next update.

e.g.

  1. We have 16 ms to update.
     The update took 6 ms.
     Sleep for 10 ms (it actually took 11 ms to wake up and be active again).
  2. We have 16 ms to update, minus the 1 ms we overslept.
     The update took 6 ms.
     Sleep for 9 ms.

How to use:

void loop() {
    FPSLimiter<> fpsLimiter;
    while (running) {
        // Do stuff

        fpsLimiter.run();
    }
}

// FPSLimiter.h

#pragma once
#include <chrono>
#include <cstdint>
#include <thread>

template <typename C = std::chrono::high_resolution_clock>
class FPSLimiter {
  public:
    explicit FPSLimiter(int64_t frameTime = 16666666) :
        m_frameTime    { frameTime },
        m_startTime    { C::now().time_since_epoch().count() },
        m_sleepTime    { 0 },
        m_frameTimeDebt{ 0 } {
    }

    void run() {
        m_sleepTime = (m_frameTime - m_frameTimeDebt) - (C::now().time_since_epoch().count() - m_startTime);
        if (m_sleepTime > 0) {
            std::this_thread::sleep_for(std::chrono::nanoseconds(m_sleepTime));
        }
        m_frameTimeDebt = (C::now().time_since_epoch().count() - m_startTime) - (m_frameTime - m_frameTimeDebt);
        m_startTime = C::now().time_since_epoch().count();
    }

  private:
    int64_t m_frameTime;
    int64_t m_startTime;
    int64_t m_sleepTime;
    int64_t m_frameTimeDebt;
};

THANK YOU @Smiles

Somehow I didn’t think about searching Google for a nanosecond-resolution sleep function :face_with_monocle:

I’m refactoring your code now.
I’ll soon post a solution with your refactored FPSLimiter.h, or with my own refactored function that uses std::chrono::high_resolution_clock.

Test using the class I created to see if you get the correct results.

Sleeping for nanoseconds will have the same issues as milliseconds.
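
A quick way to see that is a standalone sketch like this (mine, not from the thread): request a fixed 2 ms sleep in a loop and print how long the thread was actually gone. On a typical desktop OS the overshoot is clearly visible whatever unit you pass in.

#include <chrono>
#include <iostream>
#include <thread>

int main()
{
	using namespace std::chrono;

	for (int i = 0; i < 5; ++i) {
		auto before = steady_clock::now();
		std::this_thread::sleep_for(milliseconds(2)); // ask for exactly 2 ms
		auto slept = steady_clock::now() - before;
		std::cout << duration_cast<microseconds>(slept).count() << " us actually slept\n";
	}
	return 0;
}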

Yes, I just tested it @Smiles!

Now I’m mad, because it seems like the wake-up time of these functions is really long. I really hoped that this_thread::sleep_for would solve the problem.

I’ve been trying to implement your logic my own way for about 8 hours now, but I’m done with chrono’s weird behaviour :face_vomiting:
You check the time elapsed since the first frame, while I try to check it dynamically every frame.

This is my code, with weird unexpected behaviour I cannot explain. It’s as if this else branch never fires:

else
{
	this_thread::sleep_for(nanoseconds(sleepTime - wakeUpTimeEpsilon)); // wakeUpTimeEpsilon is speed-up parameter because last frame started rendering too late
						
	wakeUpTimeEpsilon = high_resolution_clock::now().time_since_epoch().count() - sleepTime - lastFrameTime - frameStartTime; // additional unexpected wake up time that should be subtracted from next sleeping time
}

and then it suddenly fires and makes the program sleep forever. The wakeUpTimeEpsilon number seems to grow very large after some time.

Sadly, my algorithm looks right to me, and I don’t see why it doesn’t work like yours.

CODE:

Uint16 iFrames = 0; // frame count within the current second
Uint64 iFramesTime = 0; // accumulated render time (0 to 1 second) of all counted frames
Uint64 frameStartTime = 0;
Uint64 wakeUpTimeEpsilon = 0;
static constexpr Uint16 FPS_LIMIT = 500; // 2001 and above is bugged
static constexpr Uint64 EXPECTED_FRAME_TIME = 1000000000 / FPS_LIMIT; // expected render time in nanoseconds of a frame capped by FPS_LIMIT
void processFps(bool showFps = true, bool fpsLimit = false)
{
	if(fpsLimit)
	{
		Uint64 lastFrameTime = high_resolution_clock::now().time_since_epoch().count() - frameStartTime;
		Uint64 sleepTime = EXPECTED_FRAME_TIME - lastFrameTime;
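		// careful: both operands are Uint64, so if lastFrameTime ever exceeds EXPECTED_FRAME_TIME this subtraction wraps around to a huge value, the check below stays true, and the sleep runs practically forever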

		if(sleepTime > 0) // if last frame was rendered too fast, wait to make FPS limited
		{
			if(sleepTime <= wakeUpTimeEpsilon) // if e.g. it's needed to sleep 0.2ms(sleepTime) but at the same time needed to speed up by 0.3ms(wakeUpTimeEpsilon) then it's needed to speed up by 0.1ms
				wakeUpTimeEpsilon -= sleepTime;
			else
			{
				this_thread::sleep_for(nanoseconds(sleepTime - wakeUpTimeEpsilon)); // wakeUpTimeEpsilon is speed-up parameter because last frame started rendering too late
						
				wakeUpTimeEpsilon = high_resolution_clock::now().time_since_epoch().count() - sleepTime - lastFrameTime - frameStartTime; // additional unexpected wake up time that should be subtracted from next sleeping time
			}
		}
	}

	if(showFps)
	{
		iFrames++;
		iFramesTime += high_resolution_clock::now().time_since_epoch().count() - frameStartTime;

		if(iFramesTime >= 1000000000) // if the frames reached 1 second of rendering time
		{
			cout << iFrames << " fps" << endl;

			iFrames = 0;
			iFramesTime = 0; // probably more precise than iFrames = 0;
		}
	}
}
while(true)
{
	frameStartTime = high_resolution_clock::now().time_since_epoch().count();

	if(SDL_PollEvent(&wndEvent) && wndEvent.type == SDL_QUIT)
		break;

	glClearColor(0, 0, 0, 0);
	glClear(GL_COLOR_BUFFER_BIT);

	SDL_GL_SwapWindow(window);

	processFps(true, true);
}

I may have confused you with my values; here is an SDL version of what I did.
Just pass how many frames per second you want into the constructor.

Just note that SDL_GetTicks() may not give you the precision you need.
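
If the millisecond resolution of SDL_GetTicks() does become a problem, one possible alternative (my suggestion, not part of the class below) is SDL2’s performance counter, which usually ticks well below a microsecond:

#include <SDL.h>

// Milliseconds elapsed since `start`, measured with the high-resolution
// performance counter instead of SDL_GetTicks().
static double elapsedMs(Uint64 start)
{
    return (double)(SDL_GetPerformanceCounter() - start) * 1000.0
         / (double)SDL_GetPerformanceFrequency();
}

// usage:
// Uint64 start = SDL_GetPerformanceCounter();
// ... render ...
// double ms = elapsedMs(start);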

void App::run()
{
    FPSLimiter fps(100);
    while (m_window.isOpen()) {
        m_window.update();
        update();
        draw();
        fps.run();
    }
}

#pragma once
#include <SDL.h>

class FPSLimiter
{
  public:
    explicit FPSLimiter(int64_t framesPerSecond = 30)
        : m_frameTime{ 1000 / framesPerSecond } // convert FPS to ms per frame
        , m_startTime{ SDL_GetTicks() }
        , m_sleepTime{ 0 }
        , m_frameTimeDebt{ 0 }
    {
    }

    void run()
    {
        m_sleepTime = (m_frameTime - m_frameTimeDebt) - (SDL_GetTicks() - m_startTime);
        if (m_sleepTime > 0) {
            SDL_Delay(m_sleepTime);
        }
        m_frameTimeDebt = (SDL_GetTicks() - m_startTime) - (m_frameTime - m_frameTimeDebt);
        m_startTime     = SDL_GetTicks();
    }

  private:
    int64_t m_frameTime;
    int64_t m_startTime;
    int64_t m_sleepTime;
    int64_t m_frameTimeDebt;
};

Your original code is 100% understandable and 300% better.

I just wanted to prove what I said post above.

Thanks for your help!
I don’t expect anybody to study my last post; it’s a waste of time, unless someone wants to arrive at your algorithm a different way.

If you need any more help, please ask. :+1: