
Performance Engineering for the Layman
1643415182  

🏷️ blog

As my nephews are coming of age, I'm considering taking an apprentice. This has me thinking more about how I might best explain programming best practices to the layman. Today I'd like to focus on performance.

Suppose you had to till, plant and water an arbitrary number of acres. Would you propose ploughing a foot, planting a seed and watering it, ad nauseam? I suspect not. This is because context switching costs a great deal. Indeed, the context switches between ploughing, planting and watering will end up being the costliest part of scaling this (highly inefficient) process to many acres.

This is why batching of work is the solution everyone reaches for instinctively. It is from this fact that economic specialization developed. I can only hold so much in my own two hands and can't be in two places at once. It follows that I can produce far more washed dishes or cooked orders working as a dish-washer or a cook all day than I can by switching between the two tasks repeatedly.
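To put rough numbers on the field example, here is a minimal sketch in Python. It assumes a toy model in which every change of task counts as one context switch; the function names and the three-task list are made up for illustration.

```python
# Toy model (an assumption, not a measurement): each change of task
# between consecutive steps counts as one context switch.
TASKS = ("plough", "plant", "water")


def interleaved(feet):
    """Plough a foot, plant it, water it, then move on to the next foot."""
    return [task for _ in range(feet) for task in TASKS]


def batched(feet):
    """Plough the whole field, then plant all of it, then water all of it."""
    return [task for task in TASKS for _ in range(feet)]


def count_switches(schedule):
    """Count how often two consecutive steps use a different task."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev != cur)


if __name__ == "__main__":
    feet = 1000
    print("steps in either schedule:", len(batched(feet)))
    print("interleaved switches:", count_switches(interleaved(feet)))
    print("batched switches:", count_switches(batched(feet)))
```

Both schedules do the same amount of work. Only the number of switches differs, and in the batched schedule it does not grow with the size of the field.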

That said, doing so only makes sense at a particular scale of activity. If your operational scale can't afford specialized people or equipment you will be forced to "wear all the hats" yourself. Naturally this means that operating at a larger scale will be more efficient, as it can avoid those context switching costs.

Unfortunately, the practices adopted at small scale prove difficult to overcome. When these are embodied in programs, they are like concreting in a plumbing mistake (and thus quite costly to remedy). I have found this to be incredibly common in the systems I have worked with. The only way to avoid such problems is to insist your developers test not against trivial data sets, but against worst-case data sets.

Optimizing your search pattern

When ploughing you can choose a pattern of furrowing that ends up right where you started, to minimize the cost of the eventual context switch to seeding or watering. Almost every young man has mowed a lawn and has come to this understanding naturally. Why is it then that I repeatedly see simple performance mistakes which a manual laborer would consider obvious?

For example, consider a file you are parsing to be a field, and its lines to be the furrows. If we need to make multiple passes, it will behoove us to avoid a seek to the beginning, much like we try to arrive close to the point of origin in real life. We would instead iterate in reverse over the lines on the second pass. Many performance issues are essentially a failure to understand this problem, which is to say, a cache miss: where we need to be is not within immediate sequential reach of our working set, so a costly context switch must be made.
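The two-pass idea can be sketched in Python as follows. This is a toy model that only counts how far we "walk" between line positions. It says nothing about real disk or cache behaviour, which depends on the OS and hardware. The function names are hypothetical.

```python
def rewind_order(n):
    """Two forward passes: after the first, jump all the way back to line 0."""
    return list(range(n)) + list(range(n))


def turnaround_order(n):
    """A forward pass, then a reverse pass starting where the first one ended."""
    return list(range(n)) + list(range(n - 1, -1, -1))


def travel(order):
    """Total distance moved between consecutively visited line positions."""
    return sum(abs(b - a) for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    lines = ["row %d" % i for i in range(5)]

    # First pass walks forward, second pass walks back over the same lines.
    for line in lines:
        pass  # e.g. collect headers
    for line in reversed(lines):
        pass  # e.g. resolve references, starting from the most recent line

    print("rewind travel:", travel(rewind_order(len(lines))))
    print("turnaround travel:", travel(turnaround_order(len(lines))))
```

The reverse pass only works when the second pass does not care about line order. If it does, the seek back to the start is a cost you have to pay knowingly.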

All important software currently in use got there precisely because it understood this, and its competitors did not. The reason preforking webservers, and then PSGI/WSGI + reverse proxies, took over the world is because of this -- program startup is an important context switch. Indeed, the rise of Event-Driven programming is entirely due to this reality. It encourages the programmer to keep as much as possible in the working set, where we can get acceptable performance. Unfortunately, this is also behind the extreme bloat in working sets of programs, as proper cache loading and eviction is a hard problem.

If we wish to avoid bloat and context switches, both our data and the implements we wish to apply to it must be sequentially available to each other. Computers are in fact built to exploit this; "Deep pipelining" is essentially this concept. Unfortunately, a common abstraction which has made programming understandable to many hinders this.

Journey to flatland

Object-Orientation encourages programmers to hang a bag on the side of their data as a means of managing the complexity involved with "what should transform this" and "what state do we need to keep track of while doing so". The trouble with this is that it encourages one-dimensional thinking. My plow object is calling the aerateSoil() method of the land object, which is instantiated per square foot, which calls back to the seedFurroughedSoil() method... You might laugh at this example (given the problem is so obvious with it), but nearly every "DataTable" component has this problem to some degree. Much of the slowness of the modern web is indeed tied up in this simple failure of developers to realize they are context switching far too often.

This is not to say that object orientation is bad, but that one-dimensional thinking (as is common with those of lesser mental faculties) is bad for performance. Sometimes one-dimensional thinking is great -- every project is filled with one-dimensional problems which do not require creative thinkers to solve. We will need dishes washed until the end of time. That said, letting the dish washers design the business is probably not the smartest of moves. I wouldn't have trusted myself to design and run a restaurant back when I washed dishes for a living.

You have to consider multiple dimensions. In 2D, your data will need to be consumed in large batches. In practice, this means memoization and tight loops rather than function composition or method chaining. Problems scale beyond this -- into the third and fourth dimension, and the techniques used there are even more interesting. Almost every problem in 3 dimensions can be seen as a matrix translation, and in 4 dimensions as a series of relative shape rotations (rather than as quaternion matrix translation).
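As a rough illustration of "memoization and tight loops rather than method chaining", here is a minimal Python sketch. The soil types, the WATER_TABLE values and every name in it are invented for the example. It shows the shape of the two approaches, not a benchmark.

```python
from functools import lru_cache

# Hypothetical data: litres of water needed per square foot, by soil type.
WATER_TABLE = {"clay": 1, "loam": 2, "sand": 3}


class SquareFoot:
    """One object per square foot, driven by chained method calls."""

    def __init__(self, soil):
        self.soil = soil
        self.litres = 0

    def aerate(self):
        return self

    def seed(self):
        return self

    def water(self):
        self.litres = WATER_TABLE[self.soil]
        return self


def per_object(soils):
    """Allocate an object and run three method calls for every square foot."""
    return [SquareFoot(s).aerate().seed().water().litres for s in soils]


@lru_cache(maxsize=None)
def litres_for(soil):
    """Memoized: the answer for each soil type is worked out only once."""
    return WATER_TABLE[soil]


def batched(soils):
    """One tight loop over the whole field, writing into a flat list."""
    litres = [0] * len(soils)
    for i, soil in enumerate(soils):
        litres[i] = litres_for(soil)
    return litres


if __name__ == "__main__":
    field = ["clay", "loam", "sand", "clay"] * 3
    print(per_object(field) == batched(field))
```

Both functions return the same list. The difference is how much allocation and call overhead sits between each square foot and the next. Which one is faster on real data is something to measure rather than assume.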

The outside view

Thankfully, this discussion of viewing things from multiple dimensions hits upon the practical approach to fixing performance problems. Running many iterations of a program with a large dataset under a profiling framework (hopefully producing flame-graphs) is the change of perspective most developers need. Considering the call stack forces you into the 2-dimensional mindset you need to be in (data over time).

This should make sense intuitively, as in the example of the ploughman. He calls furrough(), seed() and water() upon the dataset consisting of many hectares of soil. Which of these takes the majority of the time should be immediately obvious simply by observing how long each call takes per foot of soil acted upon, plus the cost of the context switches between them.
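A minimal sketch of that outside view, using Python's standard cProfile and pstats modules. The three functions reuse the names from the ploughman example, and their bodies are placeholder busy-work with made-up relative costs. The output is a flat text table rather than a flame graph, but it answers the same question: which call is eating the time.

```python
import cProfile
import io
import pstats


def furrough(feet):
    return sum(i for i in range(feet))


def seed(feet):
    return sum(i for i in range(feet * 2))


def water(feet):
    # Deliberately the most expensive step in this made-up example.
    return sum(i for i in range(feet * 10))


def work_field(feet):
    furrough(feet)
    seed(feet)
    water(feet)


def profile_field(feet):
    """Run work_field under the profiler and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    work_field(feet)
    profiler.disable()

    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats()
    return report.getvalue()


if __name__ == "__main__":
    # Use a large dataset; trivial inputs hide the real costs.
    print(profile_field(200_000))
```

Sorting by cumulative time puts the callers that account for the most total time at the top, which is usually the first thing you want to see.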


Audit::Log released to CPAN
1642470899

🏷️ video 🏷️ blog 🏷️ troglovlog 🏷️ perl
For those of you interested in parsing audit logs with perl.

Looks like I need to make some more business expenses if I want to be able to stream 4k video!

Async/Await? Real men prefer Promise.all()
1615853053  

🏷️ video 🏷️ blog

I've been writing a bunch of TypeScript lately, and figured out why most of the "Async" modules out there are actually fakin' the funk with coroutines.

Turns out even pedants like programmers aren't immune to meaning drift! I guess I'm an old man now lol.

Article mentioned: Troglodyne Q3 Open Source goals


Link Unfurling with HTML::SocialMeta
1609954054  

🏷️ video 🏷️ tcms 🏷️ blog
I did a deep dive into how pasted links turn into previews in chat and social media applications and was pleasantly surprised to find CPAN had the solution for me. I found a couple of gotchas you might want to know about if you don't want to figure this out the hard way.

tCMS Hacking VII: Mixed Content Warnings
1609455753  

🏷️ video 🏷️ streams
A common problem in websites is the "Mixed Content Warning" on SSL virtualHosts. In the end it becomes yet another "I should (and do) know better" stream, lol

tCMS Hacking VI: How programming usually goes
1609454786  

🏷️ video 🏷️ streams
I tried to fix a bug, but had to fix other things first. This is how most days go when you are programming.

tCMS Deploys using Buildah and Podman
1609442334  

🏷️ video 🏷️ streams
Branching out thanks to our friends over at the Houston Linux User's Group.

tCMS Hacking V: Speeding up Docker deployment with overlays
1609292913  

🏷️ video 🏷️ streams
The fundamental motivation for all programmers -- "this is taking too long!"

Speaking of, this stream took way too long because the documentation I was looking at was solving a different problem (smaller disk size rather than less time).

Feed my greedy algos!!!1

tCMS Hacking IV: Practical concerns when doing docker deploys
1609273138  

🏷️ video 🏷️ streams
Try not to stick your hands in the guts of your containers unless you want jungle diseases. Here's a practical example of doing the targeted surgery required to keep sane.

tCMS Hacking III: Filter your REQUEST_URI or you'll die
1609264670  

🏷️ video 🏷️ streams
Yet another from the "I should know better" (and do) files. A little dab of regex will do.

tCMS Hacking II: Making schema updates
1608089881  

🏷️ video 🏷️ streams
Cleaning up after your SQL mistakes is important. Here's how to do it in a way that minimizes downtime.

tCMS Hacking: Removing unneeded schema
1608089748  

🏷️ video 🏷️ streams
From the "I should know better" files. Having predictable user identifiers which are also superfluous is pretty bad, and something I already knew not to do, but hey, worse is better!

Playwright for Perl: Update 2
1607806104  

🏷️ video 🏷️ testing
Wherein big progress is made.

Playwright for Perl!
1607804450  

🏷️ video 🏷️ testing
Selenium is dead. Long live Playwright! Though just at the start of things today, surprisingly good progress has been made already.
https://github.com/teodesian/playwright-perl

Add an Emoji Picker to your Website!
1607571375  

🏷️ video
Had a desire to add some emojis earlier to a news posting I was doing for teodesian.net. Decided to do a few seconds of googlin' and found a pretty decent looking library for doing this here:

https://github.com/woody180/vanilla-javascript-emoji-picker

Figured I may as well let y'all ride shotgun with me as I added this sucker to my posting UI as such. Hopefully it helps someone out there on the interwebs!

Programming Videos
1607567362  


Random videos involving actual code slinging

© 2020-2023 Troglodyne LLC