troglodyne.net : blog

The impact of open source licensing on your business 🔗

1636067872

Recently, OVID had some remarks about using GPL3 code in your projects. The most relevant bit is this:

Do you have any code that cannot be open sourced but uses code with a "permissive" license that in turn uses code with a GPL license?

Congratulations! You now have a court case on your hands if anyone finds out.

The backstory here is that one of his friends has some issues he can't fix without hiring a lawyer.

Normally, this is not too problematic (even when the upstream is hostile) when you use packaged software and libraries, as distributing patches is fairly straightforward. However, sometimes there are complicating circumstances, such as the clouded title that results from contentious forks. The other circumstance is the viral nature of some licenses such as GPLv3 and Affero. There are even more extremely ideological licenses out there, but few which are of any practical consequence.

Both the normal and Affero GPL have the practical consequence of you needing to either license proprietary data or sell services rather than sell software. That is, unless you adopt a gratuity model, which has proven less than viable in the overwhelming majority of circumstances. Even the idea of selling services is difficult to secure against competition, as the recent war between Elasticsearch and Amazon proves. It's quite a bitter pill to swallow to be undercut by a competitor using the fruits of your own ongoing labors.

It's not an easy choice to make. Choosing to forgo software with viral licenses means a longer time-to-market, and that time is not always available. Similarly, your business model may be to help people with their data, not sell access to yours. Ultimately the only thing you can really rely upon in the long term are your own individual wits and physical capital, licenses and laws notwithstanding. This is why most tech firms (if they survive long enough) end up becoming glorified consulting firms like IBM.

The war on scraping is lost for the same reason as the war on piracy 🔗

1635962678

🏷️ piracy 🏷️ scraping

A great deal of effort is expended upon anti-scraping measures in webpages. There are a number of reasons for this:

prohibitive bandwidth costs involved in allowing bulk downloads
wanting tight control over how users view the data in order to influence their conclusions
competitors stealing content and reproducing their service in a jurisdiction in which they have no legal recourse.

The first concern is best addressed by rate-limiting mechanisms or metering fees. The second concern has become quite a heated topic for social media of late. This is not a problem for most services, as it's not good business to second guess paying customers. On the other hand, if the business is advertising (as it is with social media) influence is precisely the point.

For most businesses the concern will be the last one. I've learned over my career programming that data oriented design not only results in faster code, but less code. It's entirely possible to build a successful business with entirely open source code but proprietary data using this model. That said, it makes one uniquely vulnerable to such data theft.

Can we fix it?

Enter anti-scraping technology. For a good overview of the current landscape, see here. You may have noticed the core problem is "fingerprinting", which is essentially the same problem software licensing has to solve. That is because scraping is exactly the same problem as software piracy: programs are just data that transform other data.

Those of you who, like me, have implemented software licensing schemes are well aware that basically nothing but phone-homes coupled with fingerprinting is worth pursuing. Even then, there is no real way to prevent people from nopping your checks out. Generally you see mechanisms to ensure that a crack for one version does not work on the next. This has resulted in a status quo of customers submitting to this stick in exchange for the carrot of ongoing code updates.

Which is to say a stalemate in the immediate term, but total surrender in the long term. The only reliable way to prevent this is to never allow clients to interpret your code. Even then side-channel attacks are possible to reverse-engineer it.

This model breaks down for targeted and simple programs, as after some point there's nothing to update. I suspect this is much of the reason that observation of Zawinski's law is so prevalent in the software industry. There is however no such concern with data, as you can always add more. The video game industry in particular has embraced this with zeal. Expansion content not only drives much of sales, it also works quite well to keep their content artisans fully employed when they might otherwise have downtime.

You may have noticed that the ultimate remedy available to software is not exactly feasible for data. Data cannot be fully obscured from the client in nearly all use cases. Anti-scraping measures (as you can see from the overview) have also failed almost comprehensively. This has had far-reaching effects on a number of industries.

Tech blogging has been totally smothered by plagiarists who know how to do SEO. The only real reason to write one nowadays is as a big "hire me" billboard. My father was an inventor with a number of patents, and he discovered (the hard way) that they were also useless except as an inducement for employment. Almost every social media platform which started out with good APIs has now comprehensively crippled or dropped them altogether, and an industry of scraping-based tools has popped up to satisfy this need. Plaid became a multibillion dollar company by scraping bank websites using bank clients' own logins.

As with software licensing, this raises the question of why any of this effort is expended at all, given it's ultimately Canute screaming at the tides. This comes down to legal reasons. The courts generally say that you "had it coming" if you left a gold bar in the middle of the street and it gets stolen. So it is with software and data. If you don't at least make a token effort at anti-circumvention you have no recourse. Of course, this is not consistently applied to all firms and jurisdictions but such is the law. If we wanted consistent outcomes, we'd replace black robes and powdered wigs with programs. Even then this has no bearing internationally, as most firms' ability to have recourse there is nil.

Time to get creative

The good news is that it turns out that any effort beyond token prevention in fact hinders your ability to stop piracy. Pirates are inherently lazy, and you can exploit this to get a handle on the problem. For example, I once worked with an IP based licensing scheme that also gathered OS fingerprints passively, but did not do enforcement based on the latter. This allowed some people to feel they were quite clever running a number of instances behind NAT. Periodically they got rounded up (random reinforcement works best for operant conditioning) and told they were going to get a lifetime ban unless they bought the right number of licenses and signed an NDA about the incident. This was but one example of many over the years where laying traps for pirates paid off quite handsomely.
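To make the trap concrete, here is a minimal sketch of the reporting side only. It assumes the license server can dump its check-ins as one "<ip> <fingerprint>" pair per line; that format and the function name nat_suspects are made up for illustration and are not the scheme I actually worked with.

```shell
# nat_suspects [ALLOWED]
# Input (stdin): one license check-in per line as "<ip> <os-fingerprint>".
#                This two-column format is an assumption for the sketch.
# Output: "<ip> <count>" for every licensed IP seen with more than
#         ALLOWED (default 1) distinct fingerprints.
nat_suspects() {
    local allowed
    allowed=${1:-1}
    # sort -u collapses repeated check-ins from the same machine, so the
    # awk counter ends up counting distinct fingerprints per IP.
    sort -u | awk -v allowed="$allowed" '
        NF >= 2 { count[$1]++ }
        END { for (ip in count) if (count[ip] > allowed) print ip, count[ip] }
    ' | sort
}
```

Note that this only produces a list to review. Nothing here enforces anything, which is the point: the enforcement stays manual and unpredictable.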

Just like with my previous article about the victory of spam, the proper mindset is not to fight but "make the trend your friend". The motivations for piracy and spamming are both deeply ingrained in human nature. The most powerful people and organizations in the world have fought that war against our baser natures for millennia and are still no closer to victory than when they set out. This time will not be different.

Welcome to spamworld, where nobody reads 🔗

1635869893

🏷️ SEO 🏷️ spam

Recently, a resume went viral for getting good responses despite being filled with obvious BS such as rickrolls, thanks to being SEO'd out the wazoo. A less obvious variant of this trick for more serious people has been to include a paragraph of text in a white font (so that it is not visible unless selected, or even when printed) filled with these SEO keywords. While these tricks can open some doors, they still aren't enough because people still don't read what gets past the filters. Some of the reason for this is plain laziness, but the truth is that what gets past the filters is still too much to read.

This can get especially frustrating for programmers looking for contracts that have a large corpus of public work (such as a blog, or OSS contribs). Prospective employers invariably ask you to take yet another test whether or not you have clear and demonstrated ability to solve their business problems. At the end of the day, exploiting social proof is still what's needed to get hired. Whether you leverage a personal connection, build fame or use "jedi mind tricks" to quickly build emotional investment in an interview it always has to be done. The skills you actually need to do the job basically don't matter at all; they're just one more filter.

Why can't we have nice things?

The core issue which is unremarked upon here is that the war on spam is over, and the spammers have decisively won. The only set of spam filters which actually can catch 100% of spam also catches 100% of non-whitelisted ham. The most recent weapon in this war is greytrapping where you blacklist anyone sending to addresses not at the server, as it's evidence of scanning.
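Greytrapping as described here is simple enough to sketch in a few lines. This assumes you can extract delivery attempts as "<sender-ip> <rcpt-address>" pairs and keep a file of the addresses that really exist on the server; both formats and the name greytrap_hits are assumptions for illustration, not any mail package's interface.

```shell
# greytrap_hits VALID_FILE
# VALID_FILE: addresses that really exist on this server, one per line.
# Input (stdin): one delivery attempt per line as "<sender-ip> <rcpt-address>"
#                (an assumed format, adapt it to your MTA's logs).
# Output: each sender IP that mailed an address we do not host.
greytrap_hits() {
    local valid
    valid=$1
    awk -v valid="$valid" '
        # Load the known-good addresses, case-insensitively.
        BEGIN { while ((getline addr < valid) > 0) ok[tolower(addr)] = 1 }
        # Any recipient outside that set is evidence of scanning.
        NF >= 2 && !(tolower($2) in ok) { print $1 }
    ' | sort -u
}
```

The output is a plain list of IPs, so it can be fed to whatever blacklist mechanism you already use.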

I realized that this approach could also be applied in various other places to improve web hosting in general, as scanning happens all the time. My /var/log/messages is usually filled with queries for domains that are not, and never have been, on the box. You could similarly ban IPs that send HTTP requests specifying incorrect Host headers.

There are a lot of areas where the other techniques applied to email would actually help. Greylisting phone calls in particular would essentially extinguish the epidemic of scam calls immediately. Especially if you combined it with a mandatory up-front first time leave-a-message running a bot-or-not analysis. That said, it appears there is 0 motivation to change in the telephony space. After all, most major smartphones have supported sending and receiving encrypted SIP calls identified by email addresses for years, yet we still trade the equivalent of IP addresses and pay for this!
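The Host header variant might look something like the sketch below. It assumes your web server can log requests as "<client-ip> <host-header>" pairs; that format and the name unknown_host_clients are hypothetical, and the hostnames you pass in are expected to be lowercase.

```shell
# unknown_host_clients HOSTNAME...
# Arguments: the (lowercase) hostnames this box actually serves.
# Input (stdin): one request per line as "<client-ip> <host-header>"
#                (an assumed log format).
# Output: each client IP that asked for a host we do not serve.
unknown_host_clients() {
    local served ip host
    served=" $* "
    # Lowercase the whole stream so the comparison is case-insensitive.
    tr 'A-Z' 'a-z' | while read -r ip host; do
        host=${host%%:*}                    # drop any :port suffix
        case "$served" in
            *" $host "*) ;;                 # legitimate vhost, ignore
            *) printf '%s\n' "$ip" ;;       # scanning candidate
        esac
    done | sort -u
}
```

As with greytrapping, what you do with the resulting IPs (firewall rule, fail2ban-style jail, or just a report) is a separate decision.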

This still doesn't fix the problem though. Given the only foolproof solution is whitelisting, it surprises me that no major mail package or hosting control panel automatically adds anyone you directly mail to the whitelist. Most don't even auto-whitelist your addressbook!

But wait! There's MORE

There is an even more insidious problem introduced by the net. While there is an endless tide of spam there is also more ham than anyone could ever possibly eat. This is the current state of scientific publishing despite the replication crisis. What happens when the possible routes of investigation are more than you could ever possibly investigate?

While it is possible that multiple routes lead to your destination, it's likely only one of them is optimal. As programmers know well, this is close to an undecidable problem short of exhaustive search. This flood of "not wrong, but not useful" content which increasingly hinders my search for solutions (again, thank you SEO blogs) has grown increasingly concerning.

I've begun to wonder if this will be the mechanism by which the spread of knowledge regresses to the pre-internet mean. I certainly don't relish the days of having to drive to and then search library stacks to get answers. I don't think it'll be as bad as it used to be, but this has major ramifications for AI researchers. If we can barely get through this tide of junk, I suppose it comes as no surprise that "expert systems" turn out to be closer to "mediocrities copying and pasting from stackoverflow".

This is good news for content creators at least. It means that posts like this one where I lead off with some "in the news" thing can easily be evergreened in the future. This is because everyone's social media feed is eternal september of the guy who just started paying attention. As PT Barnum said, there's a sucker born every minute!

Power Distortions in the Firm 🔗

1635781264

🏷️ management 🏷️ corporate

The most corrosive element in any relationship is power, especially when the wielder does not understand the way it subtly warps their interactions with others. Middle managers in firms are quite unaware of this, as in the rest of their lives they are powerless peasants like the rest of us. Doing the sort of context switch to make this work does not come naturally, and the means by which we select managers does not select for the self-reflective. Occasionally they develop the necessary faculties, but this necessarily means their advance in rank will cease and much of the good they do will be plowed under by their peers.

This is why much of modern automation in firms is giving dynamite to children. Once managers saw how much things like issue trackers helped teams internally, they could not resist using them as yet another lever to micro-control the process. The strength of Auftragstaktik is in practice paid no more than lip service.

Having fallen victim to the siren song of automated measurement, they forget that now they have the same problem as search engines. Unscrupulous employees are now able to SEO their way into the top ranks of performance with very little effort. This is much of why the urge in firms to pick low-hanging fruit to get numbers up is so widespread. It is also yet another shackle on managers themselves, as management begins to use the same hammer on its own ranks. This further distracts them from their true purpose of resolving systemic barriers to progress.

Management by Reid Technique

I can't think of a better way to induce anxiety and destroy productivity in the workforce than regularly scheduled police interrogations. Which is essentially the primary way in which employees and management interact now, commonly known as the "one on one". Well-meaning managers put out pieces like this on how they can be positive interactions.

The summary is that management generally wants to hear "all is well" so they may return to inaction, as this is easy. Basically anything else is seen as emotional whining they need to pacify at best. At worst the manager goes full on cop mode and fires people over throwing a tantrum. This in particular is quite perceptive:

A Disaster is the end result of poor management. Your employee believes totally losing their shit is a productive strategy and they believe it's the only option left to making anything change.

It is true that many do not resort to communication of facts until incredibly frustrated that their subcommunications have been comprehensively ignored. This is a rational response to the actual goal of the meeting: what managers want to hear is ketman, so that's what people give them.

A manager who understands the distorting nature of the power they wield would not engage in such tactics. Like torture, one-on-ones can't possibly achieve anything you actually want. All you will hear is what you want to hear, or emotional outbursts which can and should be disregarded.

The only real way to learn the truth is to observe from a dis-empowered position, like Henry V going into camp incognito. It's either that or have spies. This is much of why QA is defined as "providing information to decision makers". The reports from your QA department are what should be finding the problems in the production process you need to resolve.

As to the people problems, an "open door" policy should suffice. If people won't tell you these things until they explode anyways, this at least saves time. This is not the policy by and large, as management is in love with the idea of prevention. While this is indeed the right strategy in the production process, it is dangerously wrong for personal development. Never allowing people to make interpersonal mistakes is to deprive them of essential learning opportunities. Can one truly be said to have repented under the lash? Or be said to be good without having experienced evil and rejected it?

The only way to avoid these distortions is systemic reform of the organization. Scaling organizations without diluting ownership (as in a partnership) inevitably results in the single-elimination ass-kissing tournament. As such we cannot expect anything but self-service (much less reform) from management at large. The attendant mendacity is a cost of doing business in large firms.

Even in a firm without these problems power can still prove corrosive. That said the incentives are at least not aligned against doing the correct thing.

Detect OOMs via cron 🔗

1635525018

🏷️ scripts

I recently had a problem with a hypervisor where the dom0 was underprovisioned for RAM, and it OOMKiller'd the hypervisor processes for VMs during scheduled rsync backups. After nice-ing the processes appropriately to prevent this I decided to implement a simple cron to email me whenever oomkiller fires. This obviously isn't foolproof, as something could be killed preventing outgoing mails. It is however unlikely, so this will likely be a good enough solution going forward.

oomdetect.sh
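#!/bin/bash
# Keep a log of every oom-killer line seen so far and print a notice when
# it grows. Cron mails whatever this script prints.
LOG=/root/ooms.log

touch "$LOG"
FSZ=$(stat --printf "%s" "$LOG")

# Merge old and new matches, dropping duplicates but keeping their order.
# Writing to a temp file first avoids clobbering the log while reading it,
# and keeps one entry per line so tail -n1 shows only the newest one.
TMP=$(mktemp)
{ cat "$LOG"; grep -i oom-killer /var/log/messages; } | awk '!seen[$0]++' > "$TMP"
mv "$TMP" "$LOG"
NEWSZ=$(stat --printf "%s" "$LOG")

if [ "$FSZ" != "$NEWSZ" ]
then
	echo "New OOM detected, investigate /var/log/messages:"
	tail -n1 "$LOG"
fi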

Pretty simple altogether, just make sure to run it once before you install it to root's crontab. Don't forget that you can send crons to multiple recipients by making the relevant updates to /etc/aliases for that cron's user and running newaliases.

Why are US firms having such a hard time hiring? 🔗

1635261979

🏷️ corporate 🏷️ recruiting

The mainstream narrative throughout the last couple years has been that additional aid to the unemployed was encouraging them to be layabouts. Now that this aid has ceased, people are looking around and realizing the problem lies somewhere else. It turns out that the reality is a combination of both being on the cusp of a demographic cliff and workers fed up with common behaviors of employers in a glutted market.

One such behavior has direct relevance to hiring. Thanks to said oversupply of labor for years, a similar situation to that experienced by women on dating apps has developed. Which is to say a spam crisis. This has resulted in incredibly aggressive filtering measures. Most commonly these are automated filtering, inflated JDs (job descriptions) and deceptive offers. For a while this was working, but lately the successful match rate (Beveridge curve) has nosedived. I suspect I know why.

Automated Filtering and JD inflation

JD inflation started out the same way most product requirements go. Too many cooks adding too many ingredients. What started out as a light scout vehicle becomes the Bradley fighting vehicle.

Eventually programmers got a hold of these documents and realized they could apply search techniques to try and improve match rates. This provided great results for both employers and workers for the first decade of the 21st century. However, like the web it became filled with spam and SEO'd content and the signal-to-noise ratio plummeted.

The practical consequences of this are twofold:

Honest qualified applicants are either discouraged or screened out
All hiring happens in practice through personal connections and recruiters

The latter is particularly concerning as recruiters are quite expensive per hire. This means that the only cost-effective way small firms can acquire talent is via personal connections. The former is also corrosive, as many on both sides have come to accept that you have to cynically game this system to get good results.

Deceptive Offers, Liar Hires

Recently a story went viral about the abysmal rate of responses a qualified applicant got for entry level jobs, and how the offer was always lower than advertised on those that did respond. It should shock nobody that marginal firms engage in catfishing to game this system and get better hires than they could normally. So long as advantage may be acquired through dishonest practices people will try it absent any meaningful sanction. All's fair in love and war and hiring.

Similarly many prospective hires get good results using jedi mind tricks pioneered by salesmen and pickup artists to rapidly build emotional investment in their counterparts. This causes a great deal of resentment in those who prefer devoting their limited time to professional excellence when they see this results in the (relatively) unskilled getting ahead and them being left behind.

Much of what we are seeing with our worker shortage is these resentments finally boiling over. Professionals either decide to "play the game" or take their ball and go home. This results in turnover at worst and checked-out disloyal workers at best.

Firms have tried to prevent this by fostering a cult atmosphere via paternalist measures and propaganda. This is both expensive and ineffective outside of the short-term. Competitors have all adopted similar benefits and anyone paying attention is wise to the BS now. Joshua Fluke has made a career on youtube pillorying this nonsense. The only option left to employers is actually raising wages.

That said, the astonishing levels of average household debt continue to weaken workers' position. They ultimately don't have enough savings to give an outright no. The American worker's only choice is who to say yes to. Woe unto those working in highly consolidated industries, where competition is less meaningful of a bargaining chip.

Demographics

What is making this even worse are demographic trends. The boomers are taking these frustrations they've been dealing with for years and being at or close to retirement age as sufficient reason to throw in the towel. Similarly many spouses are discovering they prefer being at home in a supporting role after having experienced it thanks to pandemic related layoffs. Anxiety over family formation (the lack thereof) likely factors into this decision to an extent.

Just as the mainstream narrative about the cause of the shortage was incorrect, the advocated solution to the demographic problem is also incorrect. While increased immigration will provide a fresh supply of those ignorant of the reasons Americans are reluctant to deal with US firms, this is not a sustainable solution. The internet has massively increased the speed with which immigrants are stripped of their illusions regarding the "American Dream". Many immigrants deeply resent the sword of Damocles that green cards represent and they inevitably learn the reality of the American workplace as well.

Nevertheless, supposing an unlimited supply of skilled labor willing to immigrate (which upon reflection is actually quite a dubious assumption) the can could be kicked down the road indefinitely. However there has been no meaningful increase in immigration at present, which means employers must take concrete action now or accept understaffing. This means any change in immigration policy is unlikely to happen, as it will be "too little, too late" for the vast majority of firms.

Expansion of Entrepreneurship

One other way in which people are "taking their ball and going home" is striking out on their own, much as I have. The 49% increase in EIN applications is strong evidence many are doing so. I have been surprised for years that more didn't recognize the strength of contracting earlier. The tax advantages are quite strong and automation removes much of the need for support staff. Nevertheless now that people are taking the plunge, this is removing a significant number of people from the employment rolls permanently.

Missionary or Mercenary? 🔗

1634842858

🏷️ corporate 🏷️ entrepeneurship

As part of this transition to entrepreneurship, I've talked to a lot of recruiters and companies in attempts to get contracts. Of those successful, there is almost always some element of a bait-and-switch involved either with the actual duties required or payment offered versus the expectations discussed up-front.

I don't take any of it personally; sometimes it's just a negotiation tactic you have to withstand. It is nevertheless a black mark on the relationship going forward, as it's assured they'll inevitably find some way to short you subsequently. If you are good at sticking to your guns, the place they inevitably slip is being late with payment.

What baffles me is how many employees I've known throughout the years who actually accepted an offer after such an obvious bait-and-switch without holding them to initial expectations. When I talked to them about it, their rationalization of the situation was inevitably a covert contract along the lines of "they'll do it when I prove myself". This of course was an axe ground forever in secret when that never came to pass. Such poison is everywhere when you know how to look for it in most firms.

Modern employment seems built around fostering these sorts of covert contracts. This is a side-effect of managers wanting missionaries not mercenaries. The trouble is that this vision of the firm is inevitably undermined by the incompetence of their self-serving management. It can't be any other way because the phenomenon of the "single-elimination ass-kissing tournament" and its effects described in "Moral Mazes" saturate the market in the USA totally. As such employees regularly care deeply about firms that are at best indifferent to their well being. In that case being a "missionary" feels less inspiring and more like wearing matching Nikes and drinking Kool-Aid. Missionaries tend to get disillusioned when they realize those whom they follow are not gods, just men.

Which brings us to today, where we have many firms lamenting a labor shortage. This should shock nobody paying attention to demographics. Over the last 60 years firms have enjoyed an unprecedentedly glutted labor pool thanks to both the baby boom and feminine empowerment. The reality going forward is instead a diminishing labor pool. Yet firms still regularly adjust their final offers down and withhold hiring "bonuses" until long after the initial work. They then have the gall to wonder why they're seeing a lack of enthusiasm.

Firms will not be able to get away with the sort of chicanery which has been commonplace in the last 60 years. Both wages and the meaningfulness of jobs will have to actually adjust higher. This is bad news for the management of most firms, as they've grown fat glad-handing and hiring armies of suck-ups which make them look good without being productive.

This is ultimately a hopeful sign for the future versus our reality of the last 20 years where we have had the best engineers ever led by the worst management. We're already seeing huge levels of attrition from firms which have clueless management. It is but a matter of time before people look around and see that we have had the answers all along, but are blind to them on a systemic level.

It would be a breath of fresh air to be able to deal with management on a professional level rather than have to engage in a guerrilla insurgency of "Führen unter der Hand" to achieve the actual goals of companies. I have grown weary of having to cynically exploit middle management's impulse to look good at all costs as the way to advance in a firm rather than actually accomplishing something. That said, the army has known about the superiority of Auftragstaktik, the OODA loop and ideas similar to the germ theory of management for even longer than the business community has, yet remains the poster child for the kind of management described in "Moral Mazes". The capacity for self-deception in managers clearly is at least as great as in those who work for them.

This is perhaps the greatest reason I've left it all behind me to do this hired-gun thing. At least this way I'm not holding my breath that this entrenched set of problems gets fixed.

PTR Records: waving a dead chicken 🔗

1634239190

🏷️ ipv6 🏷️ dns 🏷️ mail 🏷️ spam

Helping out people with their shared hosting problems inevitably runs into mail issues. One of the more common ones has to do with being put on an RBL, usually a code 550 bounce. This usually happens for one of three reasons:

The mailserver is configured to do what seems right and HELO per the appropriate domain rather than the domain of the server
The mailserver responds on an IP other than that of the domain returned by the HELO (either lack of a PTR record, or responding on the mail domain's IP)
The client is actually spamming

It's rarely the last, as this is fairly straightforward to prevent with outgoing mail limits, outgoing scanning and so forth.

You'd think that this cross-referencing of a domain's A record to its PTR would prevent a lot of spam, as this would effectively prevent spoofing. Unfortunately the reality of shared hosting, IPv4 scarcity and limited ISP tooling has crippled this. This has resulted in a reality where there are only 2 correct configurations:

No sharing of IPs between domains which emit mail, period. Not even CNAMEs. You can then HELO per domain on its IP.
Never HELO with anything other than the server hostname, and always respond from the server's IP

As you can see, allowing the latter (which all RBLs must, as it's the lowest common denominator here) essentially cripples the cross-referencing of the sender domain's A record with its IP and the corresponding PTR record. Spammers can spoof any domain they wish as long as their MTA HELOs correctly until they get manually reported.
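For reference, the cross-check itself is only a few lines. This is a rough sketch of the idea rather than what any particular RBL actually runs: it assumes dig is installed, and the names lookup_ptr, lookup_a and fcrdns_ok are invented for the example.

```shell
# Resolver helpers built on dig; redefine them to use another resolver
# or to feed in canned answers.
lookup_ptr() { dig +short -x "$1" | sed 's/\.$//'; }
lookup_a()   { dig +short A "$1"; }

# fcrdns_ok IP [HELO]
# Succeeds when a PTR name for IP resolves back to that same IP.
# If HELO is given, only a PTR name equal to it counts.
fcrdns_ok() {
    local ip helo name
    ip=$1
    helo=$(printf '%s' "${2:-}" | tr 'A-Z' 'a-z')
    for name in $(lookup_ptr "$ip" | tr 'A-Z' 'a-z'); do
        if [ -n "$helo" ] && [ "$name" != "$helo" ]; then
            continue                        # PTR exists but is not the HELO name
        fi
        if lookup_a "$name" | grep -qxF "$ip"; then
            return 0                        # PTR -> A round trip matches
        fi
    done
    return 1
}
```

Something like `fcrdns_ok 192.0.2.25 mail.example.com && echo ok` would then pass only when both directions agree. The loop walks every PTR name returned, so an IP with several PTR records is handled too.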

So why is it that we can't share IPs and HELO per-domain/ip? It's precisely because we are doing this PTR to IP to A record cross-referencing as an automated process over at the RBL makers like spamcop et al. That, and almost nobody is ever given a /24 block of IPs (0-255 on the least significant byte of the address). Instead the ISPs usually assign smaller blocks to multiple customers. This means that they can't delegate the NS record for *.X.Y.Z.in-addr.arpa (where X.Y.Z is the remainder of your IP, but backwards) to their client.
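If the "backwards" naming is unfamiliar, this small sketch shows how an IPv4 address maps to its PTR name and to the /24 reverse zone that would have to be delegated. The function names ptr_name and reverse_zone_24 are mine, for illustration only.

```shell
# ptr_name IPV4
# Print the reverse-DNS name that holds the PTR record for an address.
ptr_name() {
    local IFS=.
    set -- $1                   # split the address into its four octets
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$4" "$3" "$2" "$1"
}

# reverse_zone_24 IPV4
# Print the /24 reverse zone that contains the address.
reverse_zone_24() {
    local IFS=.
    set -- $1
    printf '%s.%s.%s.in-addr.arpa\n' "$3" "$2" "$1"
}
```

For 192.0.2.25 these give 25.2.0.192.in-addr.arpa and 2.0.192.in-addr.arpa respectively. The zone is the smallest unit that lines up with the dots in the name, which is why blocks smaller than a /24 are awkward to hand over.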

So instead they provide some kind of interface for their clients to add PTR records. Unfortunately many don't know that a one-to-many PTR relationship is entirely supported by DNS, much like for A records themselves. As such these interfaces inevitably only allow one domain to be specified per IP. Which leaves you with only one option: send out mail from your server's hostname and primary IP.

Meanwhile, 0 spam is prevented because of this massive hole blown in the entire scheme. The only practical outcome is that unaware sysadmins get caught up by this and are put on RBLs erroneously. Which is a pity, as if the system actually worked correctly it would be an ironclad guarantee against spoofed emails.

IPv6 could have fixed this, as we could give everyone a /64 IP range until the end of time and never run out. Delegating NS for the PTR domains could then become standard practice. IPv6 never really got adopted by the major ISPs though. Given they haven't updated their tooling for multi-PTR (which has been supported almost since the beginning of DNS), we shouldn't hold our breath that NAT is going away anytime soon either.

2 Major paths to take advantage of CDNs 🔗

1634143208

🏷️ tcms 🏷️ www

When considering a switch from traditional web hosting to something like S3 (or ipfs) plus web workers or equivalent stateless compute resources (GKP, Lambda, etc) you need to re-think how you deploy your applications. Many of the complex backends that you are used to are simply not feasible to have on the live web with this model. That said, it's not hard to write CGIs in a stateless fashion, and many only keep their state client side or in a database, which is a model that works just fine here.

That said, most of your backend probably doesn't even need to be public. It could easily be behind a firewall and just push updates to your staging and production buckets and workers. In fact, I'm considering this model for tCMS right now! There's no good reason why I need the login functionality public, at least for post editors -- all of them could run a local instance and push out updates supposing I had an rqlite backend data storage sitting in a bucket as a read only node, with the editors behind firewalls doing the RAFT thing.

From there the question is whether to do static generation or load sqlite with sql.js and do everything client-side (or a combination of both). Static versus dynamic web pages is well trod ground at this point so do what feels best for your application. Personally I will probably go with static generation as I prefer to run as little as possible on the client machine to get the best user experience.

The big drawback to using IAAS is (historically) higher TCO and a lack of good abstractions making it easy to self-host under such models. Things like OpenStack make self-hosting possible, but prohibitively expensive and it feels like swatting flies with an elephant gun for most webmasters.

Even containerization and its orchestration tools require a lot of extra thought and code versus the traditional LAMP approach. What's really needed to swim in both pools easily are some abstractions to make it easy for you to:

Treat local directories like they are S3 buckets (and/or ipfs tokens) so you have flexibility for storage deployment
Treat local httpds like they're a web worker API you deploy to, rather than vhosts

Implementing such tools would effectively allow any shared host to transition their existing infrastructure into such products, albeit without automatic scaling on their end. That said, if multiple servers are supported in a sort of federated model (again, likely mediated thru rqlite), that would largely ameliorate such concerns.

For S3, s3proxy is exactly what we are looking for. As to an imitation of web workers (or other "stateless" compute resources), that would be far more complex given there is no standardization across vendors.
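As a rough sketch of the "local directory as a bucket" idea, the function below prints an s3proxy properties file that serves a directory over the S3 API. The property names are as I recall them from the s3proxy README and should be checked against the version you install. The function name, paths and port are assumptions for illustration.

```shell
#!/bin/bash
# write_s3proxy_conf BASEDIR [ENDPOINT]: print an s3proxy properties file
# that exposes BASEDIR through the S3 API, with authentication disabled.
write_s3proxy_conf() {
    local basedir=$1 endpoint=${2:-http://127.0.0.1:8080}
    cat <<EOF
s3proxy.authorization=none
s3proxy.endpoint=$endpoint
jclouds.provider=filesystem
jclouds.filesystem.basedir=$basedir
EOF
}

# Example (not run here, paths are placeholders):
#   mkdir -p /srv/buckets
#   write_s3proxy_conf /srv/buckets > s3proxy.conf
#   s3proxy --properties s3proxy.conf
#   curl --request PUT http://127.0.0.1:8080/staging   # create a bucket
```

With authorization disabled this only belongs behind a firewall, which fits the model above where the editing side is never public.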

Dealing with out of date slave zones after an IP migration 🔗

1633971893

🏷️ rndc 🏷️ dns

I had a problem the other day where I did an IP migration on a server which was shipping slave zones to another server. Unfortunately, the IP migration script provided with plesk either failed to update the zone serials, or BIND9/rndc just didn't care for whatever reason. As such, the other nameserver kept returning responses with the old IPs, despite:

Restarting bind, reloading rndc on both
incrementing every zone serial on the zones we were authoritative for using sed
Manually wasting the zonefile on the slave, then running rndc retransfer and rndc resync on the master
rndc delzone of said slave, and then try retransfer/resync again

Nothing actually brought over the new zonefile. Obviously, more rigorous means of forcing the issue were required.

First, I needed to see if the zones on the slave were not updating, or if I was nuts. Unfortunately rndc showzone tells you next to nothing other than that a zone exists and where it comes from. This meant I needed to convert them to text, as slave zones are stored in BIND's raw (binary) format by default. To make a more generic alias, I've made a means to detect whether or not it's binary and "do the right thing".

showzone.source

showzone()
{
    # Abort with a usage hint if no domain name was passed
    echo "${1?Error domain name is not defined e.g. showzone test.test }"
    # Update the following as appropriate to your environment, this works on plesk
    ZONE_LOC=/var/named/chroot/var

    # Text zone files can be printed as-is; anything else is assumed to be raw (binary)
    if file "$ZONE_LOC/$1" | grep -q text
    then
        cat "$ZONE_LOC/$1"
    else
        /usr/sbin/named-compilezone -f raw -F text -o - "$1" "$ZONE_LOC/$1"
    fi
}
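A quicker sanity check than reading whole zone files is to compare the SOA serial each server reports. This is a small sketch under the assumption that both nameservers answer queries directly. The function names and the IPs in the example are placeholders.

```shell
#!/bin/bash
# soa_serial ZONE SERVER: print the SOA serial SERVER reports for ZONE
soa_serial() {
    # +short SOA output is: mname rname serial refresh retry expire minimum
    dig +short SOA "$1" "@$2" | awk '{ print $3 }'
}

# serials_match ZONE MASTER SLAVE: succeed only if both report the same serial
serials_match() {
    local m s
    m=$(soa_serial "$1" "$2")
    s=$(soa_serial "$1" "$3")
    echo "$1: master=$m slave=$s"
    [ -n "$m" ] && [ "$m" = "$s" ]
}

# Example (placeholder addresses, not run here):
#   serials_match test.test 10.0.0.1 10.0.0.2 || echo "slave is stale"
```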

The missing link here was doing rndc addzone in the same way it would happen automatically, to fool the system into thinking a brand new slave had been configured:

fix_borked_slaves.sh

#!/bin/bash

ZONE_LOC=/var/named/chroot/var

# Run this On slave:
for zone in $(ls $ZONE_LOC)
do
    # Only touch zones rndc actually knows about (non-empty showzone output)
    if [ -n "$(rndc showzone $zone 2>/dev/null)" ]
    then
        # Grab rndc's current definition
        TO_ADD=$(rndc showzone $zone 2>/dev/null | grep slave | awk '{ $1=$2=$3=""; print $0}')
        # Dump the current list for use on master
        echo "$zone" >> for_master_update.txt
        # Delete and re-add the zones
        rndc delzone $(rndc showzone $zone 2>/dev/null | grep slave | awk '{ print $2 }')
        rndc addzone $zone $TO_ADD
    fi
done

# Preamble to script you need to run on master, based on for_master_update.txt

# ADAPTER=eth0
# LOCAL_IP=$(ip addr show $ADAPTER | grep 'inet ' | awk '{print $2}' | sed -e 's/\/.*$//g')
# REMOTE_IP="REPLACEME"

# On REMOTE_IP, do a mass search/replace to make for_master_update.txt look like this and run it

#rndc -b $LOCAL_IP -s $REMOTE_IP -p 953 -y rndc-key retransfer $zone

This thorough application of the LART did the trick. I'm still not 100% sure what was BIND and rndc's problem here; bumping zone serials should always result in a transfer.
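The "mass search/replace" step from the commented preamble can also be scripted. Here is a minimal sketch that turns the zone list into one rndc command per zone, using the same flags as the commented example. The function name gen_retransfer and the output filename are assumptions.

```shell
#!/bin/bash
# gen_retransfer LOCAL_IP REMOTE_IP: read zone names on stdin (one per line)
# and print the matching rndc retransfer command for each.
gen_retransfer() {
    local local_ip=$1 remote_ip=$2 zone
    while IFS= read -r zone; do
        # Skip blank lines in the list
        [ -n "$zone" ] || continue
        echo "rndc -b $local_ip -s $remote_ip -p 953 -y rndc-key retransfer $zone"
    done
}

# Example, with the variables from the preamble (not run here):
#   gen_retransfer "$LOCAL_IP" "$REMOTE_IP" < for_master_update.txt > retransfer.sh
```

Reviewing the generated file before running it is cheap insurance when the list holds hundreds of zones.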

What would be better?

Nonsense like this is probably why cPanel's "DNS Clustering" never used AXFRs or master/slave DNS in the first place. Eventually people came to like that it enabled a multi-master setup where you could edit the zones from anywhere in your cluster. It was of course a pretty funky system in that it had no consensus protocol whatsoever. Edit loops and Sybil attacks were entirely possible, and for years confusing documentation made them straightforward for customers to produce by mistake. The documentation is better now, but the design of the system hasn't fundamentally changed.

I was part of the team tasked with considering redesign of that system when we had to add DNSSEC support to it. Even the supermaster mode in pdns doesn't have the same strength as the RAFT protocol. As such I think the ideal setup for that is probably pdns in sqlite mode, utilizing rqlite. This has the added benefit of being simple for people to just set up themselves.

MySQL utf8 migrations: big hairy queries 🔗

1633018799

🏷️ mysql 🏷️ utf8

I had to migrate some mysql databases to utf8mb4 last week. You would think this all would have happened a decade ago in a sane world, but this is still surprisingly common thanks to bad defaults in packaged software. Every day there's a new latin1 db and table being made because /etc/my.cnf isn't set up right out of the box. So people google around, and find some variant of this article.
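For reference, here is a sketch of the defaults that stop new latin1 databases from appearing. The option names are standard MySQL/MariaDB settings. The function name and the drop-in path in the example are assumptions that vary by distribution.

```shell
#!/bin/bash
# utf8mb4_defaults: print a my.cnf fragment making utf8mb4 the default
# for newly created databases and for client connections.
utf8mb4_defaults() {
    cat <<'EOF'
[mysqld]
character-set-server = utf8mb4
collation-server     = utf8mb4_unicode_ci

[client]
default-character-set = utf8mb4
EOF
}

# Example (path is a guess, check where your distro reads drop-ins from):
#   utf8mb4_defaults > /etc/my.cnf.d/utf8mb4.cnf
# Then list the databases that still need converting:
#   mysql -ss -e "SELECT schema_name FROM information_schema.schemata WHERE default_character_set_name <> 'utf8mb4'"
```

This only affects databases created after the change, which is why the existing ones still need the migration below.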

I wasn't willing to waste hours with a non-automated solution for the many tables and databases piled up over years. So, I wrote a script of course. The plan was this:

Convert the DB's default encoding & collation
Get the list of tables so we can iterate
Try and convert the table. If that fails, it's usually due to long varchars being indexed.
On failure, drop and re-add the indices on long varchars with an acceptable shorter prefix length
After that, try and re-convert the table

This ended up being a 99% solution. A couple of tables had some neat problems I'll talk about at the end. So without further ado, I'll talk about the queries themselves.

Converting the DB is straightforward.

 fix_db.sh
mysql -e "ALTER DATABASE $DB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci"

We'll assume for these examples that $DB is $1 (the first argument to the script).

Similarly, listing and converting tables was easy:

 fix_tables.sh ~ snippet 1
for table in $(mysql $DB -ss -e 'show tables')
    do
        mysql $DB -ss -e "ALTER TABLE $table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci" &> /dev/null
        STATUS=$?
        if [ $STATUS -ne 0 ]
        then
            ...

Now we have to do the hard part of dropping and re-adding the indices, because mysql doesn't support ALTER INDEX like postgres.

This means we need to capture the index state as it currently exists and only make the needed adjustments to turn varchar indexes into prefix indices. If you didn't read the article linked at the top, this is because utf8mb4 strings are 4x as large as latin1, and mysql has a (mostly) hardcoded index member size. You can get it bigger on InnoDB for most use cases, but in this case it was MyISAM and no such option was available.
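The arithmetic behind the prefix length is short enough to show. This sketch assumes the commonly documented key limits of 1000 bytes for MyISAM and 767 bytes for InnoDB's older row formats, and 4 bytes per character for utf8mb4. Check the limits for your own server version and storage engine.

```shell
#!/bin/bash
# max_prefix_chars KEY_BYTES [BYTES_PER_CHAR]: how many characters of a
# column fit in an index key of KEY_BYTES (default 4 bytes/char, utf8mb4).
max_prefix_chars() {
    local key_bytes=$1 bytes_per_char=${2:-4}
    echo $(( key_bytes / bytes_per_char ))
}

max_prefix_chars 1000     # assumed MyISAM limit, utf8mb4: prints 250
max_prefix_chars 767      # assumed older InnoDB limit, utf8mb4: prints 191
max_prefix_chars 1000 1   # same MyISAM limit under latin1: prints 1000
```

A varchar(255) index that fit comfortably under latin1 therefore no longer fits once every character may take 4 bytes, and a multi-column index hits the limit sooner still.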

Here's how I did it:

 fix_tables.sh ~ snippet 2

    for query in $(mysql -ss -e "select CONCAT('DROP INDEX ', index_name, ' ON ', table_name) AS query FROM information_schema.statistics WHERE table_name = '$table' AND index_schema = '$DB' GROUP BY index_name HAVING group_concat(column_name) REGEXP (select GROUP_CONCAT(cols.column_name SEPARATOR '|') as pattern FROM information_schema.columns AS cols JOIN information_schema.statistics AS indices ON cols.table_schema=indices.index_schema AND cols.table_name=indices.table_name AND cols.column_name=indices.column_name where cols.table_name = '$table' and cols.table_schema = '$DB' AND data_type IN ('varchar','char') AND character_maximum_length >= 250 AND sub_part IS NULL);")
    do
        echo "mysql $DB -e '$query'"
    done

    for query in $(mysql -ss -e "select CONCAT('CREATE ', CASE non_unique WHEN 1 THEN '' ELSE 'UNIQUE ' END, 'INDEX ', index_name, ' ON ', table_name, ' (', group_concat(concat(column_name, COALESCE(CONCAT('(',sub_part,')'), CASE WHEN column_name REGEXP (select GROUP_CONCAT(cols.column_name SEPARATOR '|') as pattern FROM information_schema.columns AS cols JOIN information_schema.statistics AS indices ON cols.table_schema=indices.index_schema AND cols.table_name=indices.table_name AND cols.column_name=indices.column_name where cols.table_name = '$table' and cols.table_schema = '$DB' AND data_type IN ('varchar','char') AND character_maximum_length >= 250 AND sub_part IS NULL) THEN '($PREFIX_LENGTH)' ELSE '' END)) ORDER BY seq_in_index), ') USING ', index_type) AS query FROM information_schema.statistics WHERE table_name = '$table' AND index_schema = '$DB' GROUP BY index_name HAVING group_concat(column_name) REGEXP (select GROUP_CONCAT(cols.column_name SEPARATOR '|') as pattern FROM information_schema.columns AS cols JOIN information_schema.statistics AS indices ON cols.table_schema=indices.index_schema AND cols.table_name=indices.table_name AND cols.column_name=indices.column_name where cols.table_name = '$table' and cols.table_schema = '$DB' AND data_type IN ('varchar','char') AND character_maximum_length >= 250 AND sub_part IS NULL);")
    do
        echo "mysql $DB -e '$query'"
    done

We then execute all the mysql statements. I output them to a shell script which can then be run, rather than executing each one as it is generated. If we simply ran them as we went, the CREATE INDEX query would come up empty, because the indexes it looks for would already have been dropped.
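One way to collect the generated statements is sketched below. It reads them line by line, since a plain for loop over command output splits on every space unless IFS has been changed elsewhere in the script. The generate_statements function is a stand-in for the two mysql calls above, and its output, the database name and the filenames are invented for illustration.

```shell
#!/bin/bash
# Stand-in for the two mysql -ss -e "..." generator queries (hypothetical output)
generate_statements() {
    echo "DROP INDEX idx_title ON posts"
    echo "CREATE INDEX idx_title ON posts (title(50)) USING BTREE"
}

# write_fix_script DB OUTFILE: read one SQL statement per line on stdin and
# write a shell script that runs each of them against DB.
write_fix_script() {
    local db=$1 out=$2 query
    echo '#!/bin/bash' > "$out"
    while IFS= read -r query; do
        [ -n "$query" ] || continue
        # %q shell-quotes the statement so spaces and parentheses survive
        printf 'mysql %q -e %q\n' "$db" "$query" >> "$out"
    done
}

# Example (not run here):
#   generate_statements | write_fix_script mydb fix_mydb.sh
#   bash fix_mydb.sh
```

Both the DROP and CREATE statements are generated before anything is executed, so the CREATE generator still sees the original indexes.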

Let's go over the details, as it uses a solid chunk of the important concepts in mysql. Read the comments (lines beginning with --) for explanations.

 drop_indices.sql

-- The plan is to output a DROP INDEX query
SELECT CONCAT('DROP INDEX ', index_name, ' ON ', table_name) AS query
-- information_schema.statistics is where the indices actually live
FROM information_schema.statistics
WHERE table_name = '$table'
AND index_schema = '$DB'
-- The indices themselves are stored with an entry per column indexed, so we have to group to make sure the right ones stick together.
GROUP BY index_name
-- We want to exclude any indices which don't have VARCHAR or CHAR cols, as their length is irrelevant to utf8mb4
-- The simplest way is to build a subquery grabbing these, and then scan the columns in the index via regexp
HAVING group_concat(column_name) REGEXP (
    select GROUP_CONCAT(cols.column_name SEPARATOR '|') AS pattern
    -- We need to cross-reference the cols in the index with those in the table
    -- So that we can figure out which ones are varchars/chars
    FROM information_schema.columns AS cols
    JOIN information_schema.statistics AS indices ON cols.table_schema=indices.index_schema
        AND cols.table_name=indices.table_name
        AND cols.column_name=indices.column_name
    WHERE cols.table_name = '$table'
    AND cols.table_schema = '$DB'
    AND data_type IN ('varchar','char')
    AND character_maximum_length >= 250
    -- sub_part is the prefix length.  In practice, any ones with a sub part already set were just fine, but YMMV.
    AND sub_part IS NULL
);

Things get a little more complicated with the index add. To aid the reader I've replaced the subquery used above (it is the same below) with "$subquery".

 make_indices.sql

-- As before, we wish to build a query, but this time to add the index back properly
SELECT CONCAT(
    'CREATE ',
    -- Make sure to handle the UNIQUE indexes
    CASE non_unique WHEN 1 THEN '' ELSE 'UNIQUE ' END,
    'INDEX ',
    index_name,
    ' ON ',
    table_name,
    ' (',
    -- Build the actual column definitions, complete with the (now) prefixed indices
    group_concat(
        concat(
            column_name,
            -- COALESCE is incredibly useful in aggregate functions, as the default behavior is to drop the result in the event of a null.
            -- We can use this to make sure that never happens when sub_part IS NULL.
            -- In that case, we either print a blank string or a prefix based on what I chose to be $PREFIX_LENGTH (in my case 50 chars) when varchar/char cols are detected by the subquery.
            COALESCE(
                CONCAT('(',sub_part,')'),
                CASE WHEN column_name REGEXP ($subquery) THEN '($PREFIX_LENGTH)' ELSE '' END
            )
        )
        -- Very important to preserve the sequence in the index, these are scanned before the latter ones and can have big perf impact
        ORDER BY seq_in_index
    ),
    ') USING ',
    -- Almost always BTREE, but might be something else if we are for example using geospatials
    index_type
) AS query
-- The rest is essentially the same as the DROP statement
FROM information_schema.statistics
WHERE table_name = '$table'
AND index_schema = '$DB'
GROUP BY index_name
HAVING group_concat(column_name) REGEXP ($subquery);

As you can see, this uses nearly every trick in the book. The only improvement I could think of would be to turn the subquery into a VIEW, as it is used multiple times. Turning the create statement generator without our constraints into a VIEW is generally useful and left as an exercise for the reader.

About the only major features I didn't use were stored procedures and extensions. This all would have been a great deal simpler had ALTER INDEX been available as in postgres. In sqlite it would be a great deal more complicated save for the fact that this isn't a problem in the first place, given utf8 is the default.

The only places this didn't work out were tables that crashed on things like SELECTs of particular rows for whatever reason. The good news was that mysqldump generally worked for these tables, and you could modify the dump's CREATE TABLE statement to set up indices properly. There were also a few places where there were very long varchars that were *not* indexed, but made the table too large. Either shortening them or turning them into LONGTEXT was the proper remedy.

What's new in playwright-perl v0.015 🔗

1631664613

🏷️ video 🏷️ testing 🏷️ playwright

Quick update on what's changed since I last talked about playwright.

tCMS September 2021 update 🔗

1631660780

🏷️ video 🏷️ tcms

I finally got time to do some work on tCMS! Here's the new goodies all the users of tCMS can expect going forward.

Games people play: Ultimate Smackdown 🔗

1629390195

🏷️ kayfabe 🏷️ branding

Mark Gardner asked me on twitter the other day why I wrote this article without specifically naming any names. I realized my reasoning here is probably worth sharing and elaborating if the question needs to be asked.

A lot of people would naturally think most of the post was about the perl community given that's sort of what this series of posts has been about up to this point. While it does indeed apply in large part, it's far from the only place I've seen these phenomena, and a general treatment will naturally be more useful. I've also personally "been the bad guy" here myself, so I can't throw the first stone.

People don't usually read blogs because they're useful. It's all about getting engagement these days which gets down to the more important reason to not "name names" unless the situation described is already well understood in the public. When you do, your brand immediately becomes pro wrestling and you're either gonna be a heel or a face.

This is also why calumny is considered sinful, as it provokes strong emotions which tend to make people do things they regret. When you decide to participate in an online kayfabe, don't be shocked when you get the wrong kind of followers. Sometimes the drooling fans are more annoying than the online lynch mob.

Beefin' is generally not what you want to see from a professional offering actual goods or services. Do you want to sell software and services, or entertainment? You generally don't get to do the same within one brand. There's a reason my political/entertainment persona online is quite separated from my professional. While I don't try to hide that it exists, the degree of separation is an important signal that you can, in fact, control yourself when it comes to what's actually important -- making an upright living through service to others.

A lot of this first year of entrepreneurship has been de-programming myself and unlearning bad corporate habits I didn't even know I had. Much of my content up to now has been little more than sharing my journey. It's been useful to some people, but nowhere near as useful as my actual skill set has been to my clients. Sure feels good to write though.

Resizing, Expanding and Using raw images 🔗

1629321079

🏷️ tutorial 🏷️ kvm

So you need to resize some VM images, and possibly inspect them later. Here's the dope:

 mount_loopback.sh
# Replace size with what you see to be appropriate
qemu-img resize $IMAGE_FILE +80G
# Setup a device for the image file so that we can resize the relevant partition
losetup -f -P $IMAGE_FILE

When using LVM, you'll have to do some extra steps to get the relevant device and resize things:

 get_mapped_volname.sh
pvscan --cache
# Grab the name of the volume
lvs
# Use the name from that to plug it in, there should be something similarly named in /dev/mapper
vgchange -ay

Now you need to resize the partition:

 resize_normal.sh
# When you have a normal DOS/BSD partition table:

# Grow the partition into the newly added space first, whatever the filesystem
growpart /dev/loop0 1

# Then grow the filesystem. Using ext FS:
resize2fs /dev/loop0p1

Things are of course different on LVM:

 lvm_extend.sh
# Resize partition you want, usually this is device 1, but on XFS the data section is usually device 2
pvresize /dev/loop0p2
lvextend -l +100%FREE /dev/mapper/$NAME

XFS also requires special handling. You will have to mount the relevant device and then:

 xfs_grow.sh
# Or, the loop device if it's not LVM
xfs_growfs /dev/mapper/$NAME

Mounting and fiddling with stuff is as simple as:

 mount_loopbacks.sh
# Replace partition number with whatever is appropriate
mount -t $FSTYPE /dev/loop0p1 $MOUNTPOINT
# Same story with LVM stuff, but use the /dev/mapper entry
mount -t $FSTYPE /dev/mapper/$NAME $MOUNTPOINT

When done, you should remove the loopback device:

 remove_loopbacks.sh
losetup -d /dev/loop0

One last point worth noting is that if you do this while the VM is active, writes will not show up until you unmount & unload the loopback devices, as they don't share journals. The best use of mounted VM disks on the hypervisor is for backups though, and for this purpose loading and mounting them just for the backup period works quite well. As such I generally also add the -r (read only) option to losetup when I'm mounting for backup purposes.
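Putting the backup use case together, here is a rough sketch of the load, mount, archive and unload cycle with a read-only loop device. The function names, the DRY_RUN switch, the tar archive format and the default partition number are all assumptions for illustration, not part of the steps above.

```shell
#!/bin/bash
# run CMD...: execute CMD, or just print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# backup_image IMAGE MOUNTPOINT DEST [PARTITION]: archive one partition of a
# raw image through a read-only loop device, then clean up.
backup_image() {
    local image=$1 mountpoint=$2 dest=$3 part=${4:-1} dev
    if [ "${DRY_RUN:-0}" = 1 ]; then
        dev=/dev/loopN   # placeholder, the real name comes from losetup --show
        echo "losetup -r -f -P --show $image"
    else
        dev=$(losetup -r -f -P --show "$image") || return 1
    fi
    # Mount read-only; undo the loop device if the mount fails
    run mount -o ro "${dev}p${part}" "$mountpoint" || { run losetup -d "$dev"; return 1; }
    run tar -C "$mountpoint" -czf "$dest" .
    run umount "$mountpoint"
    run losetup -d "$dev"
}

# Example, printing the commands without touching anything:
#   DRY_RUN=1 backup_image vm.img /mnt/vm /backups/vm.tgz
```

If the guest was running when the image was read, the filesystem may want a journal replay that a read-only device can't provide. Mount options such as noload (ext4) or norecovery (XFS) exist for that case, though whether the resulting view is consistent enough for your backups is something to test.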

Troglodyne year 1 in review 🔗

1629315907

🏷️ tcms

About halfway into my year of doing this solo entrepreneur thing, I realized a lot of my work on tCMS was not done out of a desire to outdo wordpress, ghost or any of the other CMSes which are a part of this cambrian explosion of software most of us have lived through.

Instead, it was actually done for the same reason carpenters build their own house. By god, I'm gonna do it the way I want it for once! What you get is indeed quite satisfying. Though when you zoom out and think of the long term perspective, does it actually mean as much as I feel it does? After all, generations of my ancestors built their own homes and barns. Those now living in them (if they weren't razed) now have no idea what went into it or why it was built that way.

It will be the same with software and the brands and businesses built around them. Like with houses, the only ones that will remain standing will largely be a function of what particular families, towns, firms and industries managed to stick around. As such the "0 code" grifters, for all their embarrassing obviousness, are essentially right when they focus almost exclusively on building their customer pipelines.

Now that I'm out in the world of general contracting, I actually see this everywhere. The biggest and most successful businesses tend to run lean and hard on their creaking and ancient facilities be they real or virtual. Even obsolete software, hardware and real estate work just fine, and usually with great margins now that they've long outlived their depreciation curve.

What I'm trying to say here is yet another reason to not get too wrapped up in your tech. So what if it's a mountain of garbage? Plenty of money to be made mining that heap! Using a dead language? Necromancy tends to have pretty high margins! Lots of people make their livings with run-down trucks and dilapidated real estate.

Which is ultimately why I'm sticking with my little CMS. Sure it's using Perl, and probably an evolutionary dead end as far as CMSes go. But it's mine, and at the end of the day you have to live like nobody else to live like nobody else. As long as it delivers where I need it to, I'm not gonna sweat about the future. Having seen several people build successful businesses with worse tech up close and personal, I'm confident that I can actually build a business atop this little house for my data.

I'm quite blessed to have had good advice, prudent planning, discipline and a patient business partner which has allowed me the ability to putter around until I figured out how all this works. I'm grateful for the clients I've had up to this point and their ongoing custom. I think this next year I'll be able to finally add in a software offering of my own.

I'm also quite happy with how well I've done keeping up with my open source projects. Now that it looks like I've actually got a credible hourly rate I'm beginning to wonder if setting up a charitable OSS foundation (or getting sponsorship from something existing) so I can use this pro-bono time as a writeoff will make sense in the future. I'll have to look into this, and hopefully can get a good article and video on the subject in the future.

The reality of banality: yet more lessons from a career in QA 🔗

1629266857

🏷️ oss 🏷️ moderation

A great deal of the conflict in online messaging software and social networks revolves around the idea that people should conduct themselves "better" as though that were in fact possible. A thorough reading of history will make you realize that Sturgeon's Law applies equally to interactions you have with others. When have we ever not had depraved maniacs for elites, mobs of raving heretics spreading all manner of nonsense, attention seekers and every other kind of nuisance which we are currently beset by? I don't think it's ever happened for more than brief stretches of time.

Layered atop this reality are the massively perverse incentives of the "social graph". Measuring engagement is essentially a flow meter on a sewer pipe. The fact that nobody has figured out anything better than this grotesque hack to produce relevant searches is a testament to the reality of our natures and desires.

As such, the idea that codes of conduct or "Zero tolerance" policies could even begin to address deficiencies in public discourse is beyond ludicrous. This has important implications for things such as open source projects and social networks. The more they commit to openness, the more "toxic" the discussion is guaranteed to become, as this necessarily means not filtering out 90% of the possible inputs for the simple reason that they add nothing substantive.

This is why projects with BDFLs (Benevolent Dictator For Life) actually tend to work. Maintainership almost always goes hand in glove with deleting clowns from your tracker, message boards and mailing lists. On the other hand, when you have a nebulous "open" means by which authority in a project is acquired, any fool can make a run at the crown and as such will make an effort to be heard.

At that point, whatever group in charge has two choices:

Ignore the rabble
Engage with the rabble until overwhelmed by sheer numbers

As such it should shock no one at all why projects run by nebulous groups of quasi-nobility tend to turn out the way they do. It is an inevitability that the group's legitimacy is questioned if for no better reason than it has a larger attack surface (again, Sturgeon's law means most of your ruling council will be of dubious quality).

On the other hand, the BDFL is indivisible, and bad ones don't get projects off the ground at all. The normal selection mechanism of the Bazaar helps us here. The attack surface is minimal, and discourse is likely to be healthy.

The question then arises, Why does the BDFL always step down? Heavy is the head that wears the crown. Scaling projects is hard. Many people use the tricks of modularization and shoving as much code into data as possible, but won't go all the way and release control, making the breaking of the monolith academic at best.

Done properly this resembles a feudatory arrangement in the classical sense. Many overlapping claims to particular subsystems, but in general one overarching BDFL for each, reporting up the stack to the ur-BDFL of the project. This is actually a pretty good description of how the linux kernel's development works right now. I hope that like the Good Emperors they choose successors wisely rather than leaving the title(s) up for grabs.

Which brings me to the point of all this. The reason this arrangement works is because it is naturally efficient to have people become experts in the systems they work on rather than generalists who know nothing about everything. Even studies back this up. So why would anyone want to mess up a good thing over little things like them being wicked sinners and conforming with Sturgeon's Law in the other 90% of their life?

Because we think things can be different, despite millennia of evidence to the contrary. Human situations change, but their nature does not. Even the idea that "we want better communities" is usually just a cope masking a lust for vengeance over personal slights.

Even in those rare moments of greatness, we have to know mean reversion is right around the corner. But take heart! In the worst of situations you can also be sure a return to mediocrity can't be far off. So rather than lament our lot, why not embrace it? This is what I mean when saying "It's better to be happy than right".

Sure you're gonna work and talk with some absolute toads. So what? Get over yourself, and throw your inner child down a well (it's OK, Lassie will eventually rescue it). Getting emotional about it never accomplished anything.

Here's the practical advice. In a technical project, you will have to deal with a mountain of BS from validation seekers. This is the ultimate motivation of both the holy crusader and the troll and neither should be tolerated unless you enjoy watching your fora transform into a river of sewage. These messages can easily be spotted because they don't actually contribute anything concrete. Reward good behavior by responding promptly to actual contributions, and leave everything else "on read".

Eventually long-time contributors (and users) get emotionally invested. This is where most of the crusaders actually come out of, not realizing they're playing a validation seeking game. Gee, what if this project I invested all this time in is actually bad? Does that mean I'm bad? Impossible! It's all these heretics...

Remain vigilant against the urge to get emotionally invested in your tools and organizations. Sturgeon's Law holds and mean reversion will eventually happen. If you can't move on, you eventually become a fool screaming at inanimate objects and attacking phantoms of your own imagining.

Games people play on P5P: Part Deux 🔗

1629136448

🏷️ perl 🏷️ p5p 🏷️ lolcows

Since I posted about the resignation of SAWYERX, even more TPF members have thrown in the towel. This doesn't shock me at all, as the responses to my prior post made it clear the vast majority of programmers out there are incapable of seeing past their emotional blinders in a way that works for them.

The latest round of resignations comes in the wake of the decision to ban MST. The Iron Law of Prohibition still holds true on the internet in the form of the Streisand Effect, so it's not shocking that this resulted in him getting more press than ever. See above Image.

I'll say it again, knowing none will listen. There's a reason hellbans exist. The only sanction that exists in an attention economy such as mailing lists, message boards and chatrooms is to cease responding to agitators. Instead P5P continues to reward bad behavior and punish good behavior as a result of their emotional need to crusade against "toxicity". Feed trolls, get trolls.

This is why I've only ever lurked P5P, as they've been lolcows as long as I've used perl. This is the norm with programmers, as being right and happy is the same thing when dealing with computers. You usually have to choose one or the other when dealing with other people, and lots of programmers have trouble with that context switch.

So why am I responding to this now? For the same reason these resignations are happening. It's not actually about the issue, but a way to ~~get attention~~ raise awareness about important issues. Otherwise typing out drivel like this feels like a waste of time next to all those open issues sitting on my tracker. This whole working on computers thing sure is emotionally exhausting!

Getting Features shipped in the face of resistance 🔗

1628695736

🏷️ wwic 🏷️ corinna 🏷️ perl

The big tempest in a teapot for perl these days is whether OVID's new Class and Object system Corinna should be mainlined. A prototype implementation, Object::Pad, has come out trying to implement the already fleshed out specification. Predictably, resistance to the idea has already come out of the woodwork as one should expect with any large change. Everyone out there who is invested in a different paradigm will find every possible reason to find weakness in the new idea.

In this situation, the degree to which the idea is fleshed out and articulated works against merging the change as it becomes little more than a huge attack surface. When the gatekeepers see enough things which they don't like, they simply reject the plan in toto, even if all their concerns can be addressed.

Playing your cards close to your chest and letting people "think it was their idea" when they give the sort of feedback you expect is the way to go. With every step, they are consulted and they emotionally invest in the concept bit by bit. This is in contrast to "Design by Committee" where there is no firm idea where to go in the first place, and the discussion goes off the rails due to nobody steering the ship. Nevertheless, people still invest; many absolute stinkers we deal with today are a result of this.

The point I'm trying to make here is that understanding how people emotionally invest in concepts is what actually matters for shipping software. The technical merits don't. This is what the famous Worse Is Better essay arrived at empirically without understanding this was simply a specific case of a general human phenomenon.

Hand me the MOP! 🔗

1628189700

🏷️ perl 🏷️ oo

Reading Mark Gardner's latest post on what's "coming soon" regarding OO and perl has made me actually think for once about objects, which I generally try to avoid. I've posted a few times before that about the only thing I want regarding a new object model is for P5P to make up its mind already. I didn't exactly have a concrete pain point to give me cause to say "gimme" until now. Ask, and ye shall receive.

I recently had an issue come down the pipe at playwright-perl. For those of you not familiar, I designed the module for ease of maintenance. The way this was accomplished is to parse a spec document, and then build the classes dynamically using Sub::Install. The significant wrinkle here is that I chose to have the playwright server provide this specification. This means that it was more practical to simply move this class/method creation to runtime rather than in a BEGIN block. Running subprocesses in BEGIN blocks is not usually something I would consider (recovered memory of ritual abuse at the hands of perlcc).
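As a rough illustration of the technique (sketched in Python rather than perl, with `type()` standing in for Sub::Install), building classes from a spec at runtime looks something like the following. The `SPEC` contents, `make_method` and `build_classes` are all hypothetical names made up for this sketch; the real spec comes from the playwright server and the real methods talk to it.

```python
# Hypothetical spec; in the real module this is fetched from the server.
SPEC = {
    "Page": ["goto", "click"],
    "Frame": ["goto"],
}

def make_method(class_name, method_name):
    """Build one generic method that records the call it would forward."""
    def method(self, *args):
        # A real client would send this to the server; here we just describe it.
        return {"class": class_name, "method": method_name, "args": list(args)}
    method.__name__ = method_name
    return method

def build_classes(spec):
    """Create one class per spec entry. Nothing exists until this runs."""
    classes = {}
    for class_name, methods in spec.items():
        namespace = {name: make_method(class_name, name) for name in methods}
        classes[class_name] = type(class_name, (object,), namespace)
    return classes

classes = build_classes(SPEC)
page = classes["Page"]()
result = page.goto("http://example.test")
```

The point of the sketch is the timing: there is no `Page` class for anybody to subclass until `build_classes` has run, which is the same wrinkle that runtime generation introduces on the perl side.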

Anyways, I have a couple of options to fix the reporter's inability to subclass/wrap the Playwright child classes:

Run the subprocess in BEGIN to grab the spec from the playwright_server
Install all the Moo stuff with Sub::Install as well
Throw in the towel on runtime meta and rely on compile-time code generation

I could of course abandon object orientation entirely as well. The object model is familiar to users of Selenium::Remote::Driver (which is pretty much where all my user base is coming from), so that's probably not a great idea.

On the other hand, if we had a good "default mop" like Mark discusses, this would be a non-issue given we'd already get everything we want out of bless (or the successor equivalent). It made me realize that we could have our cake and eat it in this regard by just having a third argument to bless (what MOP to use).

perl being what it is though, I am sure there are people who are in the "bless is bad and should go away" gang. In which case all I can ask is that whatever comes along accommodates the sort of crazed runtime shenanigans that make me enjoy using perl. In the meantime I'm going back to compile-time metaprogramming.

Dependencies: It depends 🔗

1627440954

🏷️ linking 🏷️ dependencies

Over the years I've had discussions over the merit of various forms of dependency resolution and people tend to fall firmly into one camp or the other. I have found what is best is very much dependent on the situation; dogmatic adherence to one model or another will eventually handicap you as a programmer. Today I ran into yet another example of this in the normal course of maintaining playwright-perl.

For those of you unfamiliar, there are two major types of dependency resolution:

Dynamic Linking: Everything uses global deps unless explicitly overridden with something like LD_LIBRARY_PATH at runtime. This has security and robustness implications, as it means the program author has no control over which libraries their program actually loads. That said, it results in the smallest size of shipped programs and is the most efficient way to share memory among programs. Given both these resources were scarce in the past, this became the dominant paradigm over the years. As such it is still the norm when using systems programming languages (save for some OSes and utilities which prefer static linking).

Static Linking: Essentially a fat-packing of all dependencies into your binary (or library). The author (actually whoever compiles it) has full control over the version of bundled dependencies. This is highly wasteful of both memory and disk space, but has the benefit of working on anything which can run its binaries. The security concern here is baking in obsolete and vulnerable dependencies. Examples of this approach would be most vendor install mechanisms, such as npm, java jars and many others. This has become popular as of late due to it being simplest to deploy. Bootstrapping programs almost always use this technique.

There is yet another way, and this is the "have our cake and eat it too" approach, best known from Windows DLLs. You can have shared libraries which expose every version ever shipped, such that applications can have the same kind of simple deploys as with static linking. The memory tradeoffs here are not terrible, as only the versions used are loaded into memory, but you pay the worst disk cost possible. Modern packaging systems get around this by simply keeping the needed versions online, to be downloaded upon demand.
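To make the three models concrete, here is a toy sketch in Python of how each one decides which version of a library an application ends up with. Everything in it is made up for illustration: `libfoo`, the version tuples and the three `resolve_*` helpers are hypothetical, and no real linker or package manager works off dictionaries like these.

```python
# Hypothetical state of one machine and one application.
GLOBAL = {"libfoo": (2, 0)}                            # the single system-wide copy
BUNDLED = {"libfoo": (1, 2)}                           # packed into the app at build time
SIDE_BY_SIDE = {"libfoo": {(1, 0), (1, 2), (2, 0)}}    # every version kept around

def resolve_dynamic(name):
    """Dynamic: the app gets whatever the system has right now."""
    return GLOBAL[name]

def resolve_static(name):
    """Static/vendored: the app gets what was bundled, regardless of the system."""
    return BUNDLED[name]

def resolve_side_by_side(name, wanted):
    """Side by side: shared storage, but the app names the version it wants."""
    if wanted not in SIDE_BY_SIDE[name]:
        raise LookupError(f"{name} {wanted} is not installed")
    return wanted

dynamic = resolve_dynamic("libfoo")
static = resolve_static("libfoo")
pinned = resolve_side_by_side("libfoo", (1, 2))
```

An OS update that changes `GLOBAL` alters the result of `resolve_dynamic` without the author doing anything, while `resolve_static` keeps returning the old (possibly vulnerable) version until somebody rebuilds.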

Anyways, perl is a product of its time and as such takes the Dynamic approach by default, which is to install copies of its libraries system-wide. That said, it's kept up with the times, so vendor installs are possible now with most modules which do not write their makefiles manually. Anyhow, this results in complications when you have multi-programming language projects, such as playwright-perl.

Playwright is a web automation toolkit written in node.js, and so we will have to do some level of validation to ensure our node kit is correct before the perl interface can work. This is further complicated by the fact that playwright also has other dependencies on both browsers and OS packages which must be installed.

Normally in a node project you could use webpack to fat-pack (statically link) all your dependencies. That said, packing entire browser binaries is a bridge too far, so we either have to instruct the user properly or install everything for them. As is usual with my projects, I bit off a bit more than I could chew trying to be good to users, and made attempts to install the dependencies for them. Needless to say, I have ceased doing so. Looking back, this willingness to please has caused more of my bugs than any other. Yet again the unix philosophy wins; do one thing, do it well. This is also a big reason why dynamic linking won -- it makes a lot of things "not your problem". Resolving dependencies and installing them are two entirely separate categories of problem, and a program only has to solve one to run.

The long-term solution here is to wait until playwright can be shipped as an OS package, as many perl libraries are nowadays. It's interesting that playwright itself made an install-deps subcommand. I hope this means OS packaging is in the cards soon, as that's most of the heavy lifting for building OS packages.

Reflections on a decade testing software 🔗

1626305255

🏷️ QA 🏷️ testing

The way software testing as a job is formally described is to provide information to decisionmakers so that they can make better decisions. Testers are fundamentally adversarial as they are essentially an auditor sent by management to evaluate whether the product an internal team produces is worth buying.

Things don't usually work out this way. The job is actually quite different in practice from its (aspirational) self-image. It turns out that the reason testers are not paid well and generally looked down upon in the industry is because of this reality. This is due primarily to the organizational realities of the modern corporation, and is reinforced by various macroeconomic and policy factors. Most of these situational realities are ultimately caused by deeply ingrained emotional needs of our species.

Testing at arm's length: wring this neck

Adversarial processes are not morally or ethically wrong. It is in fact quite useful to take an adversarial approach. For example, AI researchers have found that the only reliable process to distinguish lies from truth in a dataset is precisely through adversarial procedure. However, the usefulness of an adversarial approach is compromised when a conflict of interests exists. This is why Judges recuse themselves from trials in which they have even the appearance of outcome dependence.

Herein lies the rub. Modern software firms tend to be a paranoid lot, as their (generally untalented and ignorant) management don't understand their software is in no way unique. They seem to act like gluing together 80% open source components is somehow innovation rather than obvious ideas with good marketing. In any case, because of this paranoia they don't want to expose their pile of "innovation" and its associated dirty laundry to the general public via leaks and so forth. They mistakenly believe that they can secure this most reliably with direct employment of testers rather than being careful with their contractors.

This forgets that the individual employee usually has nothing whatsoever that could be meaningfully recovered in the event of such a breach, and is practically never bonded against this. On the other hand, a contracting business stakes everything on its professionalism and adherence to contract, and has far more to lose than a tester paid peanuts. This led me to the inescapable conclusion: The incentives encouraged for the vast majority of employed testers are the opposite of what is required to achieve the QA mission.

It turns out this happens for the same reason that Judges (paid by the state) don't recuse themselves from judgement in cases where their employer is the defendant or prosecutor. Because the job is not actually what it is claimed to be.

But wait! There's more conflicts of interest!

As if to rub this cognitive dissonance further in the face of the tester, modern organizations tend to break into "teams" and adopt methodologies such as scrum to tightly integrate the product lifecycle. Which means you as a tester now have to show solidarity and be a team player who builds others up instead of tearing down their work. To not do so is to risk being considered toxic.

The only way to actually do this is to prevent issues before they happen, which to be entirely fair is the cheapest point to do so. The problem of course with this is that it means in practice the programmer is basically doing Extreme Programming and riding shotgun with a tester. If the tester actually can do this without making the programmer want to strangle them, this means the tester has to understand programming quite well themselves. Which begs the question as to why they are wasting time making peanuts testing software instead of writing execrable piles of it. I've been there, and chose to write and make more every single time.

How dare you call my baby ugly: Testers are bad people

Everyone who remains a tester but not programmer is forced to wait until code is committed and pushed to begin testing. At that point it's too late from an emotional point of view; it's literally in the word -- commitment means they've emotionally invested in the code. So now the tester is the "bad guy" slipping schedule and being a cost center rather than the helper of the team. That is, unless the tester does something other than invalidate (mistaken) assumptions about the work product's fitness for purpose. Namely, they start validating the work product (and the team members personally by extension), emphasizing what passed rather than failed.

This is only the beginning of the emotional nonsense getting in the way of proper testing. Regardless of whether the customer wants "Mr. Right" or "Mr. Right Now", firms and their employees tend to have an aspirational self-image that they are making the best widget. The customer's reality is usually some variant of "yeah, it's junk but it does X I want at the price I want and nobody else does". I can count on one hand the software packages that I can say without qualification are superior to their competition, and chances are good that "you ain't it".

Nevertheless, this results in a level of "Quality theatre" where a great deal more scrutiny and rigor is requested than is required for proper and prompt decisionmaking. This is not opposed by most QAs, as they don't see beyond their own corn-pone. However, this means that some things which require greater scrutiny will not receive it, as resources are scarce.

We don't make mistakes, we R SMRRT

Aspirational self image can also stand in the way of quality when management holds mistaken assumptions, or the engineers are sold on a defective design. Many fall into the trap of "needing to be right more than to be happy" and will doggedly defend their own errors unless tricked into believing a course correction was both their idea and a perfection of their earlier impulses leading to here. Testers that do not understand this run right into a brick wall when they don't "let them have their story" and try to fit the facts into their narrative in a way that gives them a face-saving out. Ultimately the "I want to be right" thing is just a desire to be heard (read: validated) which many of us pick up competing for parental attention as children -- you are a good kid!

Many testers also fall into this validation trap, and provide details which management regards as unimportant (but that the tester considers important). This causes managers to tune out, helping no one. This gets especially pernicious when dealing with those incapable of understanding second and third order effects, especially when these may lead to material harm to customers. When time is a critical factor it can be extremely frustrating to explain such things to people. So much so that sometimes you just gotta say It's got electrolytes.

Sometimes the business model depends on the management not understanding such harms. At some point you will have to decide whether your professional dignity is more important than your corn pone, as touching such topics is a third rail I and others have been fried on. I always choose dignity for the simple reason that customers who expose themselves to such harms willingly tend to be ignorant. Stupid customers don't have any money.

It's all these emotional pitfalls that explain why testers still cling to the description of their job as providing actionable information to decisionmakers. The only way to maintain your professional dignity in an environment where you can do everything right but still have it all go wrong is to divorce yourself from outcomes. Pearl divers don't really care if they sell to swine, you know?

Business analysis sounds boring, surely we don't need a specialist

Speaking of things outside of your control and higher-order effects, there are also market realities which must guide the tester as to what decisionmakers want to know, but will never tell you to look for. Traditionally this was done by business analysts who would analyze the competitive landscape in cooperation with testers, but that sounds a bit too IBM to most people so they don't do it. As such this is yet another task which it turns out you as a tester have to do more often than not. It is because of this that I developed a keen interest in business analysis and Economics.

The Iron Law of Project Management states that you have to pick two of three from:

Do it fast
Do it cheap
Do it well

The reality is that in a software business with good margins (read: non-saturated market), "Do it fast" is mandatory. In startups, it has to also be cheap because your customers are less willing to go out on a limb and you have less to spend. As such it should shock nobody that the vast majority of software is riddled with bugs (which is of course good news for the tester, it's a target-rich environment).

That said the people who want to hire testers are looking to transition from fast/cheap to fast/well by throwing money at the problem. This tends to run into trouble in practice, as many firms transition into competing based on quality too early, eating the margins beyond what the market and investors will bear.

If I have to think like a businessman and a customer, why am I a salaryman?

The primary reasons for these malinvestments are interest rate suppression and corrupt units of account being the norm in developed economies. This sends false signals to management as to the desirability of investing in quality, which are then reinforced by the emotional factors mentioned earlier. Something will have to give eventually in such a situation and by and large it's been tester salaries, as with all other jobs that are amenable to outsourcing to low-cost jurisdictions. Many times firms have a bad experience outsourcing, and this is because they don't transfer product and business expertise to the testers they contract with first. There is no magic bullet there, it's gotta be sour mash to some degree. Expertise and tradition take people and time to build.

Also, price is subjective and its discovery is in many ways a mystical experience clouded by incomplete knowledge in the first place. It should be unsurprising that prices are subjective given quality itself is an ordinal and not cardinal concept. The primary means by which price is discovered is customer evaluation of a product and its seller's qualitative attributes versus how much both parties value the payment.

This ultimately means that to be an effective tester, you have to think like the customer and the entrepreneur. Being able to see an issue from multiple perspectives is very important to writing useful reports. This is yet another reason to avoid getting "too friendly with the natives" in your software development organization.

Behold the tester, high guardian of the brand

Speaking of mystical experiences, why do we care about quality at all? Quality is an important component of a brand's prestige, which is the primary subjective evaluation criteria for products. Many times there are sub-optimal things which can be done right, but only at prohibitive costs. Only luxury brands even dare to attempt these things.

Those of us mere mortals will have to settle for magic. In the field of carpentry there's an old adage: "If you can't conceal it, reveal it." Perfect joinery of trim to things like doorframes is not actually possible because wood doesn't work that way. So instead you offset it a bit. This allows light and shadow to play off it, turning a defect into decoration.

The best way to describe this in software terms is the load bearing bug. Any time you touch things for which customer expectations have solidified, even to fix something bad, expect to actually make it worse. This is because nobody ever reads change logs, much less writes them correctly. You generally end up in situations where data loss happens because something which used to hold up the process at a pain point has now been shaved off without the mitigant being updated. Many times this just means an error now sails through totally undetected, causing damage.
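A made-up example of a load bearing bug, sketched in Python. Nothing here comes from a real product: `parse_amount_old`, `parse_amount_fixed` and `import_rows` are hypothetical names, and the scenario is simply a parser whose crash on blank input was the only thing keeping bad rows out of the import.

```python
def parse_amount_old(text):
    """The 'bug': blank input raises ValueError instead of being handled."""
    return float(text)

def parse_amount_fixed(text):
    """The 'fix': blank input quietly becomes zero."""
    return float(text) if text.strip() else 0.0

def import_rows(rows, parse):
    """Downstream code which (unknowingly) relied on the crash to reject bad rows."""
    accepted, rejected = [], []
    for row in rows:
        try:
            accepted.append(parse(row))
        except ValueError:
            rejected.append(row)
    return accepted, rejected

rows = ["10.5", "", "3"]
before = import_rows(rows, parse_amount_old)
after = import_rows(rows, parse_amount_fixed)
```

Before the fix the blank row lands in the rejected pile where somebody can look at it. After the fix it is imported as a zero and nobody is told, because the mitigant in `import_rows` was never updated to match.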

These higher-order effects mean in general that "fixing it" is many times not the right answer. Like with carpentry you have to figure out a way to build atop a flawed reality to result in a better finished product. This eventually results in large inefficiencies in software ecosystems and is a necessary business reality. The only way to avoid this requires great care and cleverness with design which many times is simply not feasible in the time allotted.

Assisting in the identification of these sorts of factors is a large part of "finding bugs before they happen", especially in the maintenance cycle. It's also key to the brand, as quality is really just a proxy for competence. Pushing fixes that break other things does not build such a reputation, and should not be recommended save in the situation where data loss and damage is the alternative.

The only way to actually have this competence is to not have high turnover in your test and development department. Unfortunately, the going rate for testers makes the likelihood that skilled testers stick around quite low.

Process versus Mission driven organizations: Are we clear to go to public?

I have spoken at length about the difference between Process and Mission-Driven organizations. To summarize, Mission-driven organizations tend to emphasize accomplishment over how it is achieved, while process oriented organizations do the opposite. Mission driven organizations tend to be the most effective, but can also do great evil in being sloppy with their chosen means. Process driven organizations tend to be the most stable, but also forget their purpose and do great evil by crowding out innovation. What seems to work best is process at the tactical level, but mission at the strategic and operational/logistical level.

The in-practice organizational structure of the firm largely determines whether they embrace process or mission at scales in which this is inappropriate. Generally, strong top-down authority will result in far more mission focus, while bottom-up consensus-based authority tends to result in process focus. Modern bureaucracies and public firms tend to be the latter, while private firms and startups tend to be the former. This transition from mission to process focus at the strategic and operational level is generally a coping mechanism for scaling past Dunbar's number.

In the grips of this transformation is usually when "middle management" and "human resources" personnel are picked up by a firm. Authority over the worker is separated from authority over product outcomes, leading to perverse incentives and unclear loyalties. As a QA engineer, it is unclear who truly has the final say (and thus should receive your information). Furthermore, it is not clear to the management either, and jockeying for relative authority is common.

The practical outcome of all this is that it's not clear who is the wringable neck for any given project. The personnel manager will generally be uninterested in the details beyond "go or no go", as to go any further might result in them taking on responsibility which is not really theirs. Meanwhile, the project manager will not have the authority to make things happen, and so you can tell them everything they should know but it will have no practical impact. As such, the critical decisionmaking loop of which QA is a critical component is broken, and accomplishing the mission is no longer possible. The situation degenerates into mere quality control (following procedure), as that's the only thing left that can be done. For more information on this subject, consider the 1988 book "Moral Mazes: The World of Corporate Managers".

What then, shall we do? Ultimate Smackdown!

To actually soldier on and try to make it work simply means the QA is placing responsibility for the project upon their own head with no real authority over whether it succeeds or fails. Only a lunatic with no instinct for self-preservation would continue doing this for long (ask me how I know!), and truly pathological solutions to organizational problems result from this. It's rare to see an organization recognize this and allow QA to "be the bad guy" here as a sort of kayfabe to cope with this organizational reality. IBM was one of the first to do it, and many other organizations have done it since; this is most common amongst the security testing field. The only other (pathological) thing that works looks so much like a guerrilla insurgency for quality that it inevitably disturbs the management to the point that heads roll.

Ultimately, integrating test too tightly with development is a mistake. Rather than try to shoehorn a job which must at some level be invalidational and adversarial into a team-based process, a more arm's-length relationship is warranted.

Testing is actually the easy part. Making any of it matter is the hard part.

Whither Perl: Blue Collar Blues 🔗

1624491461

🏷️ perl

I saw a good article come over the wire: The Perl Echo Chamber: Is perl really dying? Friends, it's worse. The perl we knew and loved is already dead because the industry it grew up with is too...mature. The days of us living on the edge are over forever.

One passage in the linked article gets to half of the truth:


my conclusion is that it’s the libraries and the ecosystem that drive language use, and not the language itself.

This too was my feeling about programming languages for a long time, and why I know quite a good number of them. Hammer drive nail? Hammer good.

When was the last time you thought about making an innovative new hammer or working for a roofing company? I thought not. What I am saying is that when the industry with which a toolset is primarily associated becomes saturated, innovation will die because at some point it's good e-damn 'nough.

The web years were a hell of a rush and much like the early oil industry, the policy was drill, baby drill. Someday you run out of productive wells and new drilling tech just isn't worth developing for a long, long time. We're here. The fact that there are more web control panels, CMSes and virtualization options than you can shake a stick at is testament to this fact.

It's not all bad news though

As they say, the cure to low prices is low prices and vice versa. Given enough time, web expertise will actually be lost, much like carpentry is in the US market (because it just doesn't pay.) The need for it won't, so what we can expect the future to look like is less "bigco makes thing" than "artisan programmer cleans out rot and keeps building standing another 20 years".

Similarly, innovation doesn't totally die, it just slows down. Hell, I didn't know you could do in-place plunge cuts using an oscillating saw growing up doing lots of carpentry, but now it's commonplace. Programming Languages, Libraries, databases...they're just another tool in the bag. I'm not gonna cry over whether it's a Makita or a DeWalt.

Get over it, we're plumbers now. Who cares if your spanner doesn't change in a century. If you want to work on the bleeding edge, learn Python and Data Science or whatever they program robots with because that's what there's demand for.

The Big Shift 🔗

1623865865

🏷️ video 🏷️ troglovlog 🏷️ corporate

People are wondering if the "Great Resignation" is real. I'm here to tell you that it is, and the consequences are farther reaching than you might suspect. #TheBigShift

I've already seen a number of job-hops by the most talented engineers at firms I'm in touch with here in flyover country, as they realize they can be paid better even at remote rates by joining a coastal firm.

This is a mirror of what is happening with people moving all over the country from the coasts to flyover country. The reasons for doing so are actually the same, however.

The nature of progressive taxation means that the most productive pay the most in, and as such the departure of this small minority of people has an outsized impact on the tax base. The movement of these people from income tax states such as CA and NY to non income tax states such as TX and FL is having exactly that effect.

Similarly, the people leaving to get better remote work opportunities are those best capable of doing so; the most productive. Repeated study has shown that a minority of very productive employees do the vast majority of important work in the firm, so even a small number of high profile defections are going to have huge impact.

The question of course is "Why now?" Here, the reason for both is the same. Lockdown destroyed the inertia which was preventing movement -- the benefits of community which previously kept one from moving residence or firm were forcibly extinguished. This made all locales and firms roughly equal when it came to amenities, so you may as well just move to the highest paying and lowest taxing situation possible, as you can't exactly maximize for lifestyle anymore.

Now that people have actually pulled the trigger on movement they are realizing that those things keeping them where they were weren't that great in the first place, and actually little more than rationalizations for inertia. As such we can probably expect a de-emphasis on fringe benefits in the firm going forward.

As to the policy implications of this mass migration, it's sending an unmistakable message. Taxes are too high relative to what the citizenry of the coast get out of it, and have been for a long time. Unlike the firms, I don't expect they will adapt quickly enough to avert crisis.
