Clément Grégoire

@lectem

Mostly moved to the place where the sky is blue. Performance, GPU APIs, multithreading and game engines specialist @siliceum. Does reverse engineering for fun.

So in the end, it seems it was indeed a minifilter that made CreateFile so slow on PC (corporate security software, apparently). But this doesn't explain the issues seen on console... Perhaps it's not optimized for this at all, since games tend to have big files?

Performance people, I need your help! What's the fastest way to open a lot of files on Windows? CreateFileW is slow as hell (slower than reading the file itself). Are there ways to make it faster (some flags? warming the MFT?), obviously without changing the files? Thanks!



Damn, this sounds bad, as in "we have a process doing some heavy polling to see if someone drags and drops, eating CPU or blocking a core mutex" somewhere. Would love to see an ETL trace, but that's one reason why I'm still on Win10.

PSA: Apparently there is a kind of fix for the 24H2 low perf on Intel CPUs. It's a very strange fix that basically disables, in the registry, the option to drag & drop pins on your taskbar, and apparently disabling that gives more perf... yeah I know 🤦‍♂️ overclock.net/threads/24h2-c…



Fixed it: Most folks don't even know what a profiler is.

Most Linux folks don't even understand the point of having a profiler with a GUI. Curious what horrors @SuperluminalSft will find once they have the Linux port up and working :) I still remember when guys started to actually run a profiler on Blender :D



If you are interested in the challenges of new CPUs, have a look at this talk! (especially the 2nd half related to NUMA, which I predict will be the most challenging)


I'm slowly starting to see a loss of performance related to using `null` instead of `undefined` in @typescript. Given how much those constructs are used everywhere in the ecosystem, it's death by a thousand cuts. (Our JS engine has no JIT, so no optimization here.) typescriptlang.org/play/?#code/C4…


Clément Grégoire reposted

It's not very reliable generally. For waiting, one option is to use CreateWaitableTimerExW with the CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag.
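A minimal sketch of the waitable-timer option mentioned above. The helper name `precise_wait` is hypothetical, and the non-Windows fallback to `std::this_thread::sleep_for` is an assumption added only so the snippet builds everywhere. The flag needs a recent Windows 10 or later, so creation can fail on older systems.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#ifndef CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
#define CREATE_WAITABLE_TIMER_HIGH_RESOLUTION 0x00000002  // missing from older SDK headers
#endif
#endif

// Hypothetical helper: blocks for roughly `duration`.
// Returns false if the timer could not be created, armed or waited on.
bool precise_wait(std::chrono::microseconds duration) {
    assert(duration.count() >= 0);
#ifdef _WIN32
    HANDLE timer = CreateWaitableTimerExW(
        nullptr, nullptr, CREATE_WAITABLE_TIMER_HIGH_RESOLUTION, TIMER_ALL_ACCESS);
    if (timer == nullptr) {
        return false;  // e.g. the flag is not supported on this Windows version
    }
    LARGE_INTEGER due;
    // Negative value = relative time, expressed in 100 ns units.
    due.QuadPart = -static_cast<LONGLONG>(duration.count()) * 10;
    const bool ok =
        SetWaitableTimer(timer, &due, 0, nullptr, nullptr, FALSE) != 0 &&
        WaitForSingleObject(timer, INFINITE) == WAIT_OBJECT_0;
    CloseHandle(timer);
    return ok;
#else
    // Assumption: plain sleep on other platforms, not a high-resolution timer.
    std::this_thread::sleep_for(duration);
    return true;
#endif
}
```

In a real frame limiter you would probably create the timer once and reuse the handle instead of creating and closing it on every wait.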


Note to myself (again) : If your GPU shows abnormal timings, check your VRAM usage and find the leaking culprit using process Explorer.


Clément Grégoire reposted

Turns out EVERYONE WAS WRONG about N64 performance bottlenecks for 28 years. It's not fillrate or memory throughput. At this point, my N64 game is so memory/rdp optimized, that it's actually RSP that is holding up the whole operation.


Hey perf Xitter! I am setting up a list for software performance! Focusing on accounts that talk mostly about performance, not politics or cats, even though I like cats ;) Who do you think is missing? Don't hesitate to subscribe to it! x.com/i/lists/183179…


Clément Grégoire reposted

Measure, measure and measure again. Too often I encounter programmers who argue about software performance. That’s about as useful as arguing that your code must be correct. It is not nothing but computer systems are complex. You may think that you have everything figured…


Clément Grégoire reposted

This is also why, for some games, older CPUs with fewer cores and better single-core perf win, even if the engine usually shows very good core scaling. Amdahl's law is a bitch.
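To put rough numbers on that, here is a small sketch of Amdahl's law combined with per-core speed. The function names and the figures in the note below are made up for illustration, and the model assumes the parallel part scales perfectly, which real engines don't.

```cpp
#include <cassert>

// Amdahl's law: speedup on `cores` cores when a fraction `parallel`
// of the work scales perfectly and the rest stays serial.
double amdahl_speedup(double parallel, int cores) {
    assert(parallel >= 0.0 && parallel <= 1.0 && cores >= 1);
    return 1.0 / ((1.0 - parallel) + parallel / cores);
}

// Frame time relative to running on one reference core of speed 1.0.
// `core_speed` is the single-core speed relative to that reference.
double relative_frame_time(double parallel, int cores, double core_speed) {
    assert(core_speed > 0.0);
    return 1.0 / (amdahl_speedup(parallel, cores) * core_speed);
}
```

With 80% of the frame parallel, `relative_frame_time(0.8, 8, 1.3)` is about 0.23 while `relative_frame_time(0.8, 16, 1.0)` is 0.25: in this toy model the hypothetical 8-core chip with 30% faster cores beats the 16-core one.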

