
Fear and loathing at YAPC 🔗 1720542356


Despite being the worst-attended YAPC in recent memory, 2024's show in Vegas had some of the best talks in a long while. In no particular order, the ones I remember after a week are:

  • Damian's talk - Implements most of what you want out of a type system in perl, one of the points in my testing talk
  • Demetrios's talk - The savings from this alone will more than cover what the conference cost me
  • Gavin Hayes' WASM talk - has big implications in general, and I will try this in playwright-perl soon
  • Gavin's APPerl talk - I can see a use for this with clients immediately
  • Exodist's Yearly roundup of what's new in test2 - The PSGI app he's built into it implements a lot of my testing talk's wish list
  • Cromedome's Build a better readme - Good practical marketing advice
I would have loved to have seen the velociperl fellow show up, but I can't say I'm shocked given how attempts to circumvent P5P paralysis in such a manner have ended up for the initiators thus far.

This year we had another Science track in addition to the perl and raku tracks, which is where I submitted my testing talk. In no particular order, the ones I enjoyed were:

  • Adam Russell's paper - Using LLMs to make building semantic maps no longer feel like pulling teeth? Sign me up!
  • Andrew O'Neil's paper - Like with 3D printing, these handheld spectrographs are going to change the world.
The track generated a fair bit of controversy due to a combination of Will and Brett being habitual irritants of Perl's In-Group, miscommunication, and the associated promotional efforts. While I regard their efforts as being in good faith, I doubt the TPRF board sees it that way, given they issued something of a condemnation during the final day's lightning talks. Every year somebody ends up being the hate object; victims need to be sacrificed to Huitzilopochtli to keep the sun rising on the next conference.

That being said, the next conference is very much in doubt. Due mostly to corporate sponsorship of employee attendance largely drying up, the foundation took a bath on this one. I'm sure the waves of mutual excommunications and factionalism in the perl "community" at large haven't helped, but most of those who put on such airs wouldn't have deigned to attend in the first place. My only productive thought would be to see what it is the Japanese perl conference is doing, and ape our betters. Lots of attendance, and they're even doing a second one this year. They must be doing something right.

My Talks

I got positive feedback on both of my talks. I suspect the one with the most impact will be the playwright one, as it has immediate practical application for most in attendance. That said, I had the most productive discussions coming out of the testing talk. In particular, the bit at the start where I went over the case for testing in general exposed a lot of new concepts to people. One of the retirees in the audience raised the point that the future was "Dilbert instead of Deming", and he was right on the money. Most managers have never even heard of Deming or Juran, much less implemented their ideas.

Nevertheless, I suspect it was too "political" for some that I call out fraud where I see it. I would point out that the particular example I used (Boeing) is being prosecuted for fraud as of this writing. Even so, everyone expects they'll get a slap on the wrist. While "the ideal amount of fraud in a system is nonzero" as patio11 puts it, the systematic distribution of it and near-complete lack of punishment is (as mentioned in the talk) quite corrosive to public order. It has similar effects in the firm.

My lack of tolerance for short-sighted defrauding of customers and shareholders has got me fired on 3 occasions in my life, and I've fired clients over it. I no longer fear any retaliation for this, and as such was able to go into depth on why to choose quality instead. Besides, a reputation for uncompromising honesty has its own benefits. Sometimes people want to be seen as cleaning up their act, after all.

I very much enjoyed working with LaTeX again to write the paper. I think I'll end up writing a book on testing at some point.

I should be able to get a couple of good talks ready for next year, supposing it happens. I might make it to the LPW, and definitely plan on attending the Japanese conference next year.


Why configuration models matter: WebServers 🔗 1715714895


Back when I worked at cPanel, I implemented a feature to have customer virtualhosts automatically redirect to SSL if they had a valid cert and were configured to re-up it via LetsEncrypt (or other providers). However, this came with a significant caveat -- it could not work on servers where the operator overrode our default vhost template. There is no sane way to inject rules into an environment where you don't even know whether the template is valid. At least not in the amount of time we had to implement the project.

Why did we have this system of "templates" which were then rendered and injected into Apache's configuration file? Because Apache's configuration model is ass-backwards and has no mechanism for overriding configs for specific vhosts. Its fundamental primitive is a "location" or "directory", each of which takes a value that is either a filesystem path or a URI path component.

Ideally this would instead be a particular vhost name, such as "", "127.0.0.1", "foobar.test", or even multiple of them. But because it isn't, we saw no benefit to using the common means of separating configs for separate things (like vhosts), the "config.d" directory. Instead we parse and regenerate the main config file any time a relevant change happens. In short, we had to build a configuration manager, which means that manual edits to fix anything will always get stomped. The only way around that is to have user-editable templates that are used by the manager (which we implemented via a $template_file.local override).

Nginx recognized this, and its server primitive is organized around vhosts. However, they did not go all the way and allow multiple server blocks referring to the same vhost, with the last one encountered (say, in the config.d/ directory) taking precedence. It is not stated in the documentation, but later directives referring to the same host end up behaving the same way they do in Apache. As such, configuration managers are still needed when dealing with nginx in a shared hosting context.

This is most unfortunate, as it prevents the use of the classic solution to many such problems in programming: progressive rendering pipelines. Ideally you would have a configuration model like so:


vhost * {
    # Global config goes here
    ...
}

include "/etc/httpd/conf.d/*.conf"

# Therein we have two files, "00-clients-common.conf"
vhost "foobar.test" "baz.test" {
    # Configuration common to various domains go here, overrides previously seen keys for the vhost(s)
    ...
}

# And also "foobar.test.conf"
vhost "foobar.test" {
    # configuration specific to this vhost goes here, overrides previously seen keys for the vhost
    ...
}

The failure of the web server authors to adopt such a configuration model has made configuration managers necessary. Had they adopted the correct configuration model, they would not be needed, and cPanel's "redirect this vhost to SSL" checkbox would work even with client overrides. This is yet another reason much of the web has relegated the web server to the role of "shut up and be a reverse proxy for my app".

At one point another developer at cPanel decided he hated that we "could not have nice things" in this regard and figured out a way we could have our cake and eat it too via mod_macro. However, it was never prioritized and died on the vine. Anyone who works in corporate long enough has a thousand stories like this. Like tears in rain.
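
For the curious, here is a rough sketch of what the mod_macro approach looks like. The macro name, paths and the per-customer include are made up for illustration, not what cPanel actually shipped:

# Define the shared vhost boilerplate once
<Macro CustomerVHost $domain>
<VirtualHost *:80>
    ServerName $domain
    DocumentRoot /home/$domain/public_html
    # Per-customer overrides can live in a separate file
    IncludeOptional /etc/httpd/customer.d/$domain.conf
</VirtualHost>
</Macro>

# Stamp out one vhost per customer; a manager only has to append these lines
Use CustomerVHost foobar.test
Use CustomerVHost baz.test
UndefMacro CustomerVHost

Hand edits then live in the per-customer include files, so regenerating the Use lines doesn't stomp them.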

nginx also doesn't have an equivalent to mod_macro. One of the few places apache is in fact better. But not good enough to justify switching from "shut up and reverse proxy".


Why you should use the Rename-In-Place pattern in your code rather than fcntl() locking 🔗 1714508024

🏷️ perl

Today I submitted a minor patch for File::Slurper::Temp. Unfortunately the POD there doesn't tell you why you would want to use this module. Here's why.

It implements the 'rename-in-place' pattern for editing files. This is useful when you have multiple processes reading from a file which may be written to at any time. That roughly aligns with "any non-trivial perl application". I'm sure this module is not the only one on CPAN that implements this, but it does work out of the box with File::Slurper, which is my current favorite file reader/writer.

Why not just lock a file?

If you do not lock a file under these conditions, eventually a reader will consume a partially written file. For serialized data, this is the same as corruption.

Traditional POSIX file locking via fcntl() with a read/write lock comes with a number of drawbacks:

  • It does not work on NFS - at all
  • Readers will have to handle EINTR correctly (i.e. retry)
  • In the event the lock/write code is killed midstream, you need something to bash the file open again
Writing to a temporary file and then renaming it over the target file solves these problems.

This is because rename() atomically repoints the file name at the new inode. Existing readers continue happily reading the stale old inode, never encountering corrupt data. This of course means there is a window of time where stale data is used (i.e. the implicit TOCTOU in any action dependent on fread()). Update your cache invalidation logic accordingly, or be OK with "eventual consistency".
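
Here is a minimal sketch of the pattern in plain Perl, roughly what File::Slurper::Temp automates for you; the function name is mine, not the module's:

use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

sub write_atomically {
    my ($target, $content) = @_;
    # Create the temp file in the target's directory so rename() stays on one device
    my ($fh, $tmpname) = tempfile(DIR => dirname($target), UNLINK => 0);
    print {$fh} $content or die "write $tmpname: $!";
    close $fh            or die "close $tmpname: $!";
    # Atomic swap: existing readers keep their old inode until they re-open the path
    rename $tmpname, $target or die "rename $tmpname -> $target: $!";
    return;
}

write_atomically('settings.json', '{"ok":true}');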

Be aware of one drawback here: the temporary file (by default) is created in the same directory as the target as a means of avoiding EXDEV. That is the error you get from attempting to rename() across devices, where fcopy() is more appropriate. If you are, say, globbing across that directory with no filter, hilarity may ensue. You can change this to some other directory on the same disk which is periodically cleaned; otherwise, given enough time and killed scripts, it will fill with orphaned temp files.


KYC: A bad idea for the hosting industry 🔗 1714071973

🏷️ regulation

I try not to ever get political if I can help it here, as that's always the wrong kind of attention for a business to attract. However I'm going to have to today, as the eye of Sauron is affixed directly on my industry. If that's not for you, I encourage you to skip this article.

As of this writing, there is a proposed rule change working its way through the bowels of the Department of Commerce. Hot on the heels of the so-called "TikTok ban" (which would more rightly be called forced divestiture, i.e. "nationalization through the back door"), this rule change would require all web hosting, colo and virtual service providers to subject their customers to a KYC process of some sort.

The trouble always lies in that "of some sort". In practice the only way to comply with regulations is to have a Contact Man [1] with juice at the agency who thinks like they think. Why is this? Because regulations are always what the regulator and administrative law judge think they are. Neither ignorance nor full knowledge of the law is an effective defense; only telepathy is.

This means you have to have a fully-loaded expense tacked onto your business. Such bureaucrats rarely come cheap, oftentimes commanding six-figure salaries and requiring support staff to boot. Compliance can't ever be fully automated, as you will always be a step behind whatever hobgoblin has taken hold of the bureau today.

Obviously this precludes the viability of the "mom and pop hosting shop", and even most of our Mittelstand. This is atop the reduction in overall demand from people who don't value a website as much as their privacy, or as much as the hassle of the KYC process itself. This will cause widespread economic damage to an industry already reeling from the amortization changes to R&D expenses. And this is not the end of the costs.

KYC means you have to keep yet more sensitive customer information atop things like PII and CC numbers. This means even more stuff you have to engage in complicated schemes to secure, and yet another thing you have to insure and indemnify against breach.

However the risks don't stop with cyber-criminals looking to steal identities. The whole point of KYC is to have a list that the state can subpoena whenever they are feeling their oats. Such information is just more rope they can put around you and your customers' necks when that time comes. Anytime you interact with the state, you lose -- it's just a matter of how much. This increases that "how much" greatly.

Do you think they won't go on a fishing expedition based on this information? Do you really trust a prosecutor not to threaten leaking your book to a competitor as a way of coercing a plea, or the local PD holding it over you for protection money? Don't be a fool. You'll need to keep these records in another jurisdiction to minimize these risks.

On top of this, no actual problem (e.g. cybercrime) will be addressed via these means (indeed these problems will be made manifestly worse). Just like in the banking world, the people who need to engage in shenanigans will remain fully capable of doing so. No perfect rule or correct interpretation thereof exists or can exist. The savvy operators will find the "hole in the sheet" and launder money, run foreign intel ops and much worse on US servers just as much as they do now.

A few small-time operators will get nicked when the agency needs to look good and get more budget. The benefit to society of removing those criminals will be overwhelmed by the negatives imposed to business and the taxpayer at large.

Many other arguments could easily be made against this, such as the dubious legality of administrative "law" in the first place. Similarly, this dragooning of firms into being ersatz cops seems a rather obvious 13th amendment violation to me. However, just like with regulators, the law is whatever judges think it is. Your opinion, my opinion, and the law as written are of no consequence whatsoever. As such you should expect further consolidation and the grip of the dead hand to squeeze our industry ever tighter from now on.

Notes

[1] Günter Reimann - "The Vampire Economy", Ch. 4

ARC and the SRS: Stop the email insanity 🔗 1713224239

🏷️ email

There's a problem with most of the mail providers recently requiring SPF+DKIM+DMARC. Lots of mail handlers (Exchange, Mailman, etc.) are notorious for rewriting emails for a variety of reasons. This naturally breaks DKIM, as they don't have the private key needed to sign the messages they are forwarding. And since the nature of the email oligopoly means you absolutely have to be under the protection of one of the big mass mailers with juice at MICROS~1 or Google, this necessitated a means to "re-sign" emails.

This is where SRS came in as the first solution. Easy, just strip the DKIM sig and rewrite the sender, right? Wrong. Now you are liable for all the spam forwarded by your users. Back to the drawing board!

So, now we have ARC. We're gonna build a wall (a chain of trust), and we're gonna make Google pay for it! But wait, all DKIM signatures are self-signed. Which means that peer-to-peer trust has to be established. Good luck with that as one of the Mittelstand out there. Google can't think that small.

I can't help but think we've solved this problem before. Maybe in like, web browsers. You might think that adopting the CA infrastructure in Email just won't work. You'd be wrong. At the end of the day, I trust LetsEncrypt 100000% more than Google or MICROS~1.

So how do we fix email?

The core problem solved by SPF/DKIM/DMARC/SRS/ARC is simple. Spoofing. The sender and recipient want an absolute guarantee the message is not adulterated, and that both sides are who they say they are. The web solved this long ago with the combination of SSL and DNS. We can do the same, and address the pernicious reality of metadata leaks in the protocol.

Email servers will generally accept anything with a To, From, Subject and Body. So, let's give it to them.


To: $RECIPIENT_SHA_1_SUM@recipient-domain.test
From: $USERNAME_SHA_1_SUM@sender-domain.test
Subject: Decrypt and then queue this mail plz

(encrypted blob containing actual email here)

Yo dawg, I heard you liked email and security so I put an encrypted email inside your email so you can queue while you queue

Unfortunately, for this to work we have to fix email clients to send these mails, which (ha ha) will never happen; S/MIME and PGP being cases in point. From there we would have to have servers understand them, which is not actually difficult. Servers that don't understand these mails will bounce them, like they do to misconfigured mails (such as those with bad SPF/DKIM/DMARC/SRS/ARC) anyway. There are also well-established means by which email servers discover whether a feature is supported (EHLO, etc.) and gracefully degrade to doing it the old dumbass way. When things are supported, it works like this:

  1. We fetch the relevant cert for the sender domain, which is provided to us by the sending server.
  2. We barf if it's self-signed
  3. We decrypt the body, and directly queue it IFF the From: and Envelope From: are both from the relevant domain, and the username's sha1 sum matches that of the original From: (see the sketch after this list).
  4. Continue in a similar vein if the recipient matches and exists.
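
To make step 3 concrete, here is a minimal sketch of the address-hashing check; the function names and the lowercase normalization are my own assumptions, not part of any standard:

use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Build the obfuscated local part used on the outer message
sub hashed_address {
    my ($username, $domain) = @_;
    return sha1_hex(lc $username) . '@' . $domain;
}

# Receiving side: does the inner From: username hash to the outer local part?
sub from_matches {
    my ($inner_username, $outer_local_part) = @_;
    return sha1_hex(lc $inner_username) eq lc $outer_local_part;
}

print hashed_address('somebody', 'sender-domain.test'), "\n";
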
From there you can drop all the rest of it; SPF, DKIM, DMARC, what have you. Not needed. SpamAssassin and milters continue working as normal. If it weren't for forwarding, and the fact that you have to trust your mailserver with your life because all residential IPs are perma-banned, you could even encrypt the sender/receiver domains for marginally more deniability about which vhosts are communicating. That said, some scheme for passing this info on securely to forwards could be devised.

What can't be fixed is the receiving server having to decrypt the payload. The last hop can always adulterate the message, because email is not actually a peer-to-peer protocol (thanks to spam). This is what PGP & S/MIME are supposed to address, but failed to do by not encrypting headers. Of course this could be resolved by reaching out peer-to-peer to the user's actual domain and asking for a shared secret. Your mailserver could then be entirely flushed down the commode in favor of LDAP.

So why hasn't it happened, smart guy?

In short, this situation is why everyone long ago threw up their hands and said "I may as well just implement a whole new protocol". At some point someone has to do the hard work of pushing a solution like this over the finish line, as people will not stop using email for the same reason we still use obsolete telephones. What is needed is for mailops to reject all servers without MTA-STS, along with anything sending unencrypted, adulterated emails. Full stop.

Unfortunately the oligopoly will not, because their business model is to enable spammers; just like the USPS, that's the majority of their revenue. If I could legally weld my mailbox shut, I would. But I can't, because .gov hasn't figured out a way to communicate which isn't a letter or a fax. It's the same situation with email for anyone running a business. The only comms worth having there are email or Zoom; our days are darkness.

The security-conscious and younger generations have all engaged in the digital equivalent of "white flight" and embraced alternative messaging platforms. They will make the same mistakes and have to flee once again when their new shiny also becomes a glorified advertising delivery platform. They all eventually will flee to a new walled garden. Cue "it's a circle of liiiiife" SIIIMMMBAAAA

Is there a better solution?

Yes. It even worked for a number of years; it was called XMPP with pidgin-otr. Email clients even supported it! Unfortunately there wasn't a good bridge for talking with bad ol' email. Now everyone has forgotten XMPP even exists and is gulag'd in proprietary messengers with troubling links to the spook aristocracy.

The bridge that has to be built would be an LDA that delivers mail to an actual secure, federated messaging platform rather than a mailbox whenever it looks like it ought to. In short, it's a sieve plugin or even a glorified procmailrc. From there you need a client that distinguishes between the people you can actually talk to securely and the email yahoos, and which helpfully appends a note to the top of replies instructing people how to stop using email. As to the actual secure messaging platform, I'd use Matrix.
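
In the glorified-procmailrc spirit, here is a rough sketch of what that delivery hook could look like; the header and the bridge script path are hypothetical stand-ins for whatever the LDA actually keys on:

# Hand anything flagged as bridge-able to a Matrix delivery script
# instead of the normal mailbox; everything else falls through.
:0 w
* ^X-Deliver-Via: matrix
| /usr/local/bin/mail-to-matrix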

There's probably a product somewhere in here: a slick mail client which knows how to talk email, Matrix, RSS and ActivityPub. After all, we still need mailing lists, and ActivityPub is a better protocol for doing that. Hell, may as well make it your CMS too. More homework for me.

