The Law of Diminishing Clouds

The Law of Diminishing Clouds: when the number of computers outside of your control or visibility has a greater negative impact than the benefits of using the cloud. Can also refer to when the cost savings from the cloud are less than the internal effort and/or costs of figuring out what the cloud is doing.

Take a Break

SysAdvent is taking a break this year and I suggest you do the same.

Instead of cramming to stay current, getting ready for your next job interview, and working late, spend the time on something else.

One month focusing on a series of blog posts, books, or tutorials isn’t going to make up for anything you need to catch up on.

And if you’re like me, it will take you days to unwind from work before you can actually relax. So instead focus on your family, friends, and yourself.

Do something that doesn’t require a screen, a compiler, or a git push.

There will be plenty more months ahead to learn.

Too Many Tools

A couple of months back I decided to replace macOS with Linux on my personal computer. This was based around two things. The first was the idea that I was using too many different tools on a daily basis and wanted to thin out the list. For some unknown reason I was using multiple text editors (Ulysses, iA Writer, FoldingText, and nvALT). The second was the frustration I’ve experienced with the changes in macOS from the influence of iOS, quality issues, and Apple’s history of shedding one adapter for another in its hardware.

I spent a fair amount of time debating whether this was really necessary or whether I was just looking for another thing to tinker with. Then came the question of whether I would just install Linux on my MacBook Air or go looking for a new laptop. In the end I decided that I wanted to use the same OS I use for work on a new laptop that would have the best support for Linux. So I started looking for a replacement for the MacBook Air and Mac Mini I was using for personal use.

The first laptop I bought for myself was a Titanium PowerBook with Mac OS X and OS 9. At the time I was working as a contractor at IBM Global Services doing Unix support for SAP systems. There were plenty of different operating systems being used, including a mix of mainframe, HP-UX, SunOS, Solaris, IRIX, Tru64, AIX, NT4, and OS/2 Warp. There were also a number of us running personal systems with BeOS, NeXTSTEP, NetBSD, SuSE, Slackware, or Red Hat. The point being there were a lot of different ideas and approaches on what an operating system could be.

This was when the operating system you used was generally decided by what you needed to do with it. The majority of uses could be satisfied with Windows, many creative and print (design and layout) needs were handled with Mac OS, and a smaller number could be addressed with a Unix-like operating system.

For people who were building and supporting things for the web, OS X delivered something that was missing. It brought a mix of a Unix-like OS, a polished GUI, and support from a single-source vendor. The core pieces were released and open-sourced as Darwin, Jordan Hubbard (a co-founder of FreeBSD) joined Apple, and there were at least three different efforts around building or porting package management tools for it. This new direction from Apple created a lot of excitement. Since that TiBook I’ve owned four different Macs and it’s only been during the last year that I’ve considered something else.

Today the OS doesn’t matter as much.

Apple seems to be losing its place as the OS of choice for development work. Microsoft has added a Linux subsystem and a built-in SSH client to Windows, contributed research and work around containers and unikernels, released a version of Linux (Azure Sphere), and runs one of the top three public clouds. Linux now looks to be the leading OS for servers and mobile. You have Android, Chrome OS, IoT devices, support for ARM-based devices, and Linux is pulling its weight in iron in the private and public clouds. Extending the computer to mobile and having the cloud as a way to pass messages between devices has opened up our options.

Now, as to which distro would replace macOS, I thought about and tried Arch, but it just ended up reminding me of Gentoo. I don’t want installing and configuring Linux to be the equivalent of re-rolling the dice for the best D&D character sheet. The majority of the machines I work with are using one of the Ubuntu LTS releases. Because of this it became the obvious pick for the OS, but it was the choice of hardware that took the most time.

Originally I was looking at System76, Lenovo, and Dell. At the time there was a bit of a debate between System76 and GNOME over firmware updates from LVFS, and some of the comments from System76 left me with the impression that this is a company I’ll avoid for now. It was a review from Linux Journal that introduced me to the Librem laptops from Purism. Their focus on privacy and security, delivered in a product with a simple aesthetic, is what sold me.

I settled on buying a Librem 15v3 and planned to install Ubuntu 18.04 LTS. I did give thought to sticking with Purism’s own OS (PureOS) and I may try it at a later time, but for now I want to be using the same tools at work and at home.

The thing that was the most intimidating was the idea of using Linux as a desktop replacement again. It’s been almost two decades since I last used a GUI in Linux. That’s a long time to catch up on changes in GNOME and KDE, and then I came across AppImage, Flatpak, and Snaps.

Deliver Update Publish Execute System (DUPES)

My first thought was, great, another set of package tools to navigate, but I think that was a short-sighted view of what they provide. They’re more than just package and dependency managers. They’re closer to the architecture of an app store paired with the process to execute an app. To make it easier for myself I just started thinking of them as “dupes” instead of package management tools.

Deliver: how the application is delivered to the user.

Update: how the application updates (e.g., auto, delta).

Publish: the process of building the app with dependencies for delivery.

Execute: how the application is run and the level of isolation provided.

System: tooling to provide the previous four points.

The Browser, Editor, and Terminal

Most of the time I’m using one of three types of applications: a browser, a text editor, or a terminal. The other native applications are tools for specific purposes. In some cases I use the iOS version, because it’s easier to hold a tablet while reading, streaming something, or playing music.

Before I placed the order I started listing out the applications I use on a regular basis. A number of these apps were macOS / iOS only, but the ones I really cared about keeping had either a native Linux or an iOS version available. The idea became to use Linux on the laptop and continue with the pairing of iOS and watchOS on the phone, tablet, and watch. All I needed to do was add a stand for the iPad Pro, reuse the keyboard from the Mac Mini, and use the cloud to transfer files, and I was set.

Any development, personal projects, or anything technical would be done on the Librem. Things like budgeting, health / fitness tracking, reading, writing, and streaming would be on the iPad Pro, iPhone, or Apple Watch.

Note: The lack of an IM client that supports Messages has been the most difficult hurdle. My wife and I both have iPhones and she has a Mac. By giving up macOS I’m forced to rely on my phone or tablet when chatting with her. And if one of us wants to send the other a link, there’s a bit of a shell game needed if I want to open it on, or send it from, Linux.

Impressions after a Month

So far I’ve been happy with my choice. The only issues I’ve had are that an Ubuntu update uninstalled the firmware needed for the Bluetooth module, and that the wireless seems to require the wicd network manager instead of managing the wireless through settings or the GNOME shell.

The configuration I ordered for the Librem included a 500GB NVMe drive that’s used for the OS and a second 120GB SSD. I run GitLab and Nextcloud locally as containers that are both mounted on the second drive. This gives me some flexibility in how to transfer files between devices and I’ve enjoyed having a local set of tools I control.

There are a few things on the list to consider: replacing 1Password with Bitwarden and deciding how I’ll do backups. Right now I’m thinking of something like Restic with B2 as the remote repository.

Overall the change feels like I made the right choice.

Brain Dump

Ever wonder why you have so many ideas when you’re in the bathroom taking a shower or a bath, brushing your teeth, or doing that other thing we use the room for?

Ever notice that there are no screens in there to pull or hold our attention?

Take a minute and count how many screens are around you right now. How many different TVs, phones, tablets, computers, e-readers, portable video game consoles, smartwatches, or VR headsets are within your sight? How many of these are in the bathroom?

p.s. please don’t admit to owning a pair of AR glasses.

While having a discussion with my wife, who had a MacBook on her lap, an iPad closed on the ottoman next to her feet, and her iPhone within reach, I said…

“the nuance is blurred” and then I had no reply as she was still reading whatever it was she was reading.

I waited a minute and then asked her “you know I just asked you a question right?”

She replied “yes, something about being blurred.”

I repeated what I had said, “yeah, the nuance is blurred,” and as I started to walk to the bathroom I added, “Blur as in the band, as in song number two, as in what I’m going to do.”

Which I did, and then I noticed there are no screens in the bathroom and wondered if maybe this is why we have flashes of clarity and creativity in here. Which reminded me we recently joked about getting an Amazon Echo in here so we could ask Alexa to take notes so we don’t forget things.

So as I step back into the living room, even before I’m through the bathroom door I’m calling out to her “Don’t say anything! I had an idea that I need to write down before I forget!” She was looking at me as I stepped out and I was left with the impression she was holding onto something to tell me.

I grab my laptop, sit down, and start writing. At this point I’ve read everything that you’ve read so far to her and she laughs.

We talk about the fact there’s a screen in every room in the house except for the bathroom. We go over the semantics of whether there are really screens in the bedroom. Because, you know, the iPads we both have can follow us around. She uses hers throughout the day and I leave mine on the bedside table to watch something when I go to bed. Which isn’t really going to bed, as it’s just lying down and watching TV. No wonder I get so little sleep.

We talk about the screens even in our car. Hell, the car even emails and sends us text messages when it gets low on windshield wiper fluid. Our car has a drinking problem and it likes to let us know about it.

Used to be the only screen in the house was the one television set the family had and if you were well-off your parents had a second set in their bedroom to watch Johnny Carson together.

Instead we watch different things on our own screens that we hold out in front of us or rest on our bellies. Recently my wife asked if there’s an app so we can both watch the same thing synced up on our individual iPads. We laughed when we realized that yes, there is, and it’s called a television.

We seem to miss the opportunities we have to build something new and we underestimate the value of what we lost. Instead we build things that place each of us in our own individual world. Walled off with wireless headsets and a microphone, where you can be talking with someone who is ignoring the people around them or maybe you’re just listening to music that a machine picked to play for you.

Yes, there is usually at least one screen in the bathroom, called a mirror. It’s passive and only reflects back what we bring to it. That’s the key difference. That screen is passive and is used as a tool so we can brush our teeth or have a moment of self-reflection.

Anyways, just wanted to capture this before I lost it.

I never did find out what she was holding onto to tell me.

Attrition from The Small Things

If you find yourself thinking about what work lies ahead in 2018, consider the following. Doesn’t matter if you’re only thinking of the first month, quarter, or the entire year. How many changes can your team handle before they reach the point where they’re no longer capable of performing their normal operations? How about for yourself?

While you’re considering your answer, frame change as anything from work related to existing or new projects, ongoing work supporting existing products, incidents, pivots in priorities or business models, new regulations, reorgs, to just plain interruptions.

Every individual and team has a point where their capabilities become ineffective. We have different tools and methodologies that track time spent on tasks. We track complexity (story points) delivered in a given timeframe (sprints). Burn up charts show tasks added across time, but this is a very narrow view of change. A holistic view of the impact of change on a team is missing. One that can show how change wears down on the people we’re responsible for and ourselves.

Reduction of Capability

Before we started needing data for most decisions we placed trust in individuals to do their job. When people were being pushed too far too fast they might push back. This still happens, but the early signs of it are often drowned out by data or a mantra of stick with the process. It’s developed into a narrow focus that has eroded trust in experience to drive us towards our goals. This has damaged some of the basic leadership skills needed and it has focused our industry on efficiency over effectiveness. I’m also starting to think this is creating a tendency where people are second guessing their own abilities due to the inabilities of others.

This reinforces a culture where leaders stop trusting the opinions of people doing the work or those who are close to it. When people push back, the leaders have a choice to either listen and take the feedback into account or to double down on the data and methods used. This contributed to creating the environments where the labels “10x”, “Rock Stars” and “Ninjas” started being applied to engineers, designers, and developers.

heroics — “behavior or talk that is bold or dramatic, especially excessively or unexpectedly so: the makeshift team performed heroics.” — New Oxford American Dictionary

Ever think about why we apply the label heroics or hero when teams or people are able to pull through in the end? If the output of work and the frequency of changes were plotted, I’d bet you’d find that the point where sustaining normal operations was impracticable or improbable was passed before these labels were used.

The train in last month’s fatal Amtrak derailment, which killed three people, was traveling at more than twice the speed limit (80 mph in a 30 mph zone). The automated system (positive train control) designed to prevent these types of conditions was installed but not activated. Was this fatal accident on the inaugural run of a new Amtrak route an example of where normal operations were no longer possible? Is this any different than the fatal collisions involving US Navy ships last year due to over-burdened personnel and equipment?

For the derailment it looks like a combination of failing to use available safety systems and failing to follow safety guidelines contributed to the accident. There’s also the question of whether the crew was given training to build awareness of the new route. The Navy collisions look to be the result of the strain of trying to do too much with too few people and resources. This includes individuals working too many hours, a reduction in training, failure to verify readiness, and a backlog of maintenance on the equipment, aircraft, and ships.

The cadence of change was greater than what these organizations were capable of supporting.

For most of us working as engineers, designers, developers, product managers, or as online support, we wouldn’t consider ourselves to be in a high-risk occupation. But the work we do impacts people’s lives in small to massive ways. These examples are something that we should be learning from. We should also acknowledge that we’re not good at being aware of the negative impacts of the tempo of change on our people.

There’s a phrase and image that can illustrate the dependencies between people, processes, and systems. It’s called the “Swiss Cheese Model” and it highlights that when the shortcomings in each of them line up, a problem is allowed to happen. It also shows how the strengths of each are able to cover the weaknesses of the others.

Swiss Cheese Model of Accident Causation

Illustration by David Mack CC BY-SA 3.0.

We have runbooks, playbooks, incident management processes, and things to help us understand what is happening in our products and systems. Remember that these things are not absolute and they’re fallible. The systems and processes we put into place are never final, they’re ideas maintained as long as they stay relevant and then removed when they are no longer necessary. This requires awareness and diligence.

In any postmortem I’ve participated in or read through there were early signs that conditions were unusual. Often people fail to recognize a difference between what is happening and what is expected to happen. This is the point where a difference can start to develop into a problem if we ignore it. If you think you see something that doesn’t seem right you need to speak up.

After the Apollo 1 fire Gene Kranz gave a speech to his team at NASA that is known as the Kranz Dictum. He begins by stating they work in a field that cannot tolerate incompetence. He then immediately holds himself and every part of the Apollo program accountable for their failures to prevent the deaths of Gus Grissom, Ed White, and Roger Chaffee.

From this day forward, Flight Control will be known by two words: “Tough” and “Competent.” Tough means we are forever accountable for what we do or what we fail to do. We will never again compromise our responsibilities. Every time we walk into Mission Control we will know what we stand for. Competent means we will never take anything for granted. We will never be found short in our knowledge and in our skills. — Gene Kranz

I take this as doing the work to protect the people involved. For us this should include ourselves, the people in our organizations, and our customers. Protection is gained when we’re thorough and accountable; sufficient training and resources are given; communication is concise and assertive; and we have an awareness of what is happening.

When I compare the derailment and collisions, what Kranz was speaking to, any emergency I responded to as a firefighter, or any incident I worked as an engineer, there are similarities. They’re the result of the attrition of little things that continued unabated.

Andon Cord for People

Alerting, availability, continuous integration/deployment, error rates, logging, metrics, monitoring, MTBF, MTTF, MVP, observability, reliability, resiliency, SLA, SLI, SLO, telemetry, throughput and uptime.

We build tools and we have all kinds of words and acronyms to help us frame our thoughts around the planning, building, maintaining and supporting of products. We even allow machines to bug us to fix them, including waking us up in the middle of the night. Why don’t we have the same level of response when people break?

One of the many things that came out of the Toyota Production System is Andon. It gives individuals the ability to stop the production line when a problem is found and call for help.

We talk about rapid feedback loops and iterative workflows, but we don’t talk about feedback from the people close to the work as a way of continuous improvement. We should be giving people the ability to pull the cord when there is an issue that impacts the ability for them or someone else on the team to perform. And that doesn’t mean only technical issues.

What would happen if your on-call staff had such a horrible time that they’re spent after their first night? Imagine if we gave our people the same level of support that we give our machines. Give them an andon cord to pull (i.e., a page) that would get them the help they need.

As you’re planning don’t forget about your people. Could you track the frequency of changes happening to your team? Then plot the impact of that against the work completed? Think about providing an andon cord for them. How could you build a culture where people feel responsible to speak up when they see something that doesn’t line up with what we expect?
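One rough way to start on that tracking is sketched below in Python. It is only an illustration under assumptions of my own: the weekly records, the field names, and the `max_changes` threshold are all hypothetical, the sample numbers are made up, and a raw count of changes is a crude stand-in for the strain those changes put on people.

```python
from dataclasses import dataclass


@dataclass
class Week:
    label: str      # hypothetical label for the week, e.g. "W01"
    changes: int    # hypothetical count: new projects, incidents, reorgs, pivots, interruptions
    completed: int  # hypothetical count of work items the team finished


def flag_strained_weeks(weeks, max_changes=5):
    """Return labels of weeks where the number of changes exceeded the
    threshold and completed work dropped compared with the previous week."""
    flagged = []
    # Pair each week with the one before it so output can be compared.
    for previous, current in zip(weeks, weeks[1:]):
        too_much_change = current.changes > max_changes
        output_dropped = current.completed < previous.completed
        if too_much_change and output_dropped:
            flagged.append(current.label)
    return flagged


# Made-up sample data, purely for illustration.
history = [
    Week("W01", changes=2, completed=10),
    Week("W02", changes=4, completed=11),
    Week("W03", changes=7, completed=8),
    Week("W04", changes=9, completed=5),
    Week("W05", changes=3, completed=6),
]

print(flag_strained_weeks(history))  # prints ['W03', 'W04']
```

A flagged week is a prompt to go talk with the team, not a verdict. The numbers can’t tell you why the output dropped or how people are holding up.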

“People, ideas and technology. In that order!” — John Boyd.

Too many times we think a solution or problem is technical. More often than not it’s about a breakdown of communication, and sometimes about not having the right people or not protecting them.

The ideas from Boyd are a good example of how our industry fails to fully understand a concept before using it. If you’ve heard the phrase OODA Loop you’ve probably seen a circular image with points for Observe, Orient, Decide and Act. The thing is he never drew just a single loop. He gave a way to frame an environment and a process to help guide us through the unknowns. And it puts the people first by using their experience so when they recognize something for what it is they can act on it immediately. It was always more than a loop. It was a focus on the people and organizations.