nikclayton

Mastodon: https://mastodon.social/@nikclayton

Prompted by the questions in https://hhmx.de/@nick/16376, the answer's too long to fit in a post.

Doing some monitoring of how an original #Mastodon server responds, I never (!) saw the described Link header! The response may include no Link header, or a header like this: Link: <…>; rel="next", <…>; rel="prev"

You should get a Link header from any API response that returns a page of data, like https://docs.joinmastodon.org/methods/notifications/#get.

[Almost; in testing this I just discovered an edge case, which I've reported as a bug at https://github.com/mastodon/mastodon/issues/25495]

An API response for a single item, like https://docs.joinmastodon.org/methods/notifications/#get-one won't include the Link header.

I just tested this with https://restfox.dev against my mastodon.social account (https://github.com/tuskyapp/Tusky/pull/3610/files?short_path=00c65c2#diff-00c65c2349f395f9f9b78b0fe84d3638b78f3ef4abf98294364b5c4d0ef32473 has details on how to do this).

In my research I saw something in the #Tusky source that looks like Tusky, as a client, generates a Link header itself? If so, why?

https://github.com/tuskyapp/Tusky/blob/develop/app/src/main/java/com/keylesspalace/tusky/components/notifications/NotificationsPagingSource.kt#L107-L178

That function needs to return a page of results as an HTTP Response. The code that calls that function unpacks the Response and parses the contents of the Link header, so the header must be present.
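For illustration, here's a minimal sketch of parsing a Link header into its next/prev URLs. The Tusky code that does this is Kotlin; this is just a JavaScript approximation of the idea, and the URLs are made up.

```javascript
// Parse an RFC 5988-style Link header such as:
//   <https://example.org/...?max_id=100>; rel="next", <https://example.org/...?min_id=120>; rel="prev"
// into an object like { next: "...", prev: "..." }.
function parseLinkHeader(header) {
  const links = {};
  for (const part of header.split(",")) {
    const match = part.match(/<([^>]+)>\s*;\s*rel="([^"]+)"/);
    if (match) {
      links[match[2]] = match[1];
    }
  }
  return links;
}

const header =
  '<https://example.org/api/v1/notifications?max_id=100>; rel="next", ' +
  '<https://example.org/api/v1/notifications?min_id=120>; rel="prev"';
console.log(parseLinkHeader(header).next);
// https://example.org/api/v1/notifications?max_id=100
```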

To figure out which page of results to return, getInitialPage is given, as one of its parameters, the ID of the item (in this case, a notification) that should be at the start of the page. This is the params.key value.

The Mastodon API lets you make requests like “Get me the page immediately after ID X” or “Get me the page immediately before ID X”.

But it does not let you make a request “Get me the page that includes key X”.

So the code has to do that itself, starting around line 134. It does this by making two API calls. One to retrieve the notification with ID X, and one to retrieve the notifications immediately after that.

If those calls succeed the code has two values to operate on; the notification, and the page of notifications immediately after it.

Remember that this function needs to return a single page of notifications. So it creates a fake page which contains both values. This fake page needs a Link header (because the calling code expects one).

So that's why it constructs one in this case.
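Here's a sketch of the "fake first page" idea in JavaScript. The real Tusky code is Kotlin; the two fetcher functions here are hypothetical, and are synchronous for clarity where real API calls are asynchronous.

```javascript
// Build a fake page of notifications centred on a known notification ID.
// getNotification(id) returns the single notification with that ID;
// getNotificationsAfter(id) returns the page immediately newer than it.
function getInitialPage(key, getNotification, getNotificationsAfter) {
  const item = getNotification(key);        // API call 1: the notification with ID `key`
  const newer = getNotificationsAfter(key); // API call 2: the page immediately after it
  const items = [...newer, item];           // newest first, as the API returns pages
  return {
    items,
    // Synthesised paging keys, standing in for the Link header the caller expects:
    prevKey: items[0].id,                // use with min_id to fetch newer items
    nextKey: items[items.length - 1].id, // use with max_id to fetch older items
  };
}

// Toy fetchers for illustration:
const page = getInitialPage(
  5,
  (id) => ({ id }),
  (id) => [{ id: id + 2 }, { id: id + 1 }] // two items immediately newer, newest first
);
console.log(page.items.map((n) => n.id)); // [ 7, 6, 5 ]
```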

This code is only called to get the first (or initial) page to show the user. After that, the pages above and below are loaded with single calls to the Mastodon API, which return the Link header without any alterations.

An aside on details: should the header be named "Link:" or "link:"?

HTTP headers are case-insensitive (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers) so you can use either.
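Most HTTP libraries handle this for you, but if you ever need to look up a header by hand the usual approach is to normalise the name on lookup. A small illustrative sketch:

```javascript
// Header names are case-insensitive, so don't compare them with ===.
// Normalise both sides to lowercase instead.
function getHeader(headers, name) {
  const want = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === want) return value;
  }
  return undefined;
}

console.log(getHeader({ Link: '<...>; rel="next"' }, "link")); // <...>; rel="next"
```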

Tusky 22 now sometimes reports an array index out of range (-1) or similar (this is from memory; I'd have to check again for the exact message).

That would be a bug; a detailed report with reproduction steps would be appreciated.

During development I also saw Tusky sending requests for notifications in an endless loop... I wondered why.

I suspect the min_id / max_id handling in your server is not correct.

And... could someone clarify the difference between since_id and min_id? Does Mastodon handle these differently?

since_id = String. Return results newer than this ID
min_id = String. Return results immediately newer than this ID

What should be the difference between “newer” and “immediately newer”?

Suppose the user has 10 notifications, numbered 1 through 10.

  1 2 3 4 5 6 7 8 9 10
  ^                 ^
  |                 |
oldest            newest

They make the request:

/api/v1/notifications?since_id=5&limit=2

This means they want two (limit=2) notifications that are newer than notification 5 (since_id=5).

The server returns notifications 9 and 10, as these are the two newest notifications.

Think of since_id as “Start at the newest notification, count backwards limit notifications, and return from there”.

Same setup, but the request is now:

/api/v1/notifications?min_id=5&limit=2

This means they want two (limit=2) notifications that are immediately newer than notification 5 (min_id=5).

The server returns notifications 6 and 7, as these are the two notifications immediately newer than 5.

Think of min_id as “Start at min_id, then return the next limit notifications”.
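The two behaviours over the example above can be sketched as a toy model (this is not server code, just an illustration of the semantics):

```javascript
// Toy model of since_id vs min_id over notifications 1..10 (oldest to newest).
const all = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

// since_id: anchored at the newest end --
// take the newest `limit` results that are newer than `id`.
function sinceId(id, limit) {
  const newer = all.filter((n) => n > id);
  return newer.slice(-limit).reverse(); // the API returns newest first
}

// min_id: anchored at `id` --
// take the oldest `limit` results that are newer than `id`.
function minId(id, limit) {
  const newer = all.filter((n) => n > id);
  return newer.slice(0, limit).reverse();
}

console.log(sinceId(5, 2)); // [ 10, 9 ]
console.log(minId(5, 2));   // [ 7, 6 ]
```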

Given that min_id was added to the API after since_id, my working theory is that since_id's behaviour was a design flaw in the API as first released, which min_id later fixed.

Under the hood we're updating the Tusky design to make the code easier to work with and reason about, particularly with how timelines are handled.

A timeline is any list of posts. What you see on the Home tab is a timeline, so is the Notifications tab, Local, Federated, search results, trending tags, all of it.

This is a big task, and trying to do it to all timelines at once would be very difficult. So the work is broken up to focus on one or more timelines at a time. This also means that if there's anything that turns out to be a bad idea we can fix it on the affected timeline before it affects everything.

The timeline on the Notifications tab is the first one to get these changes.

So what's changed?

Missing posts

There have been occasional reports about posts missing from timelines (https://github.com/tuskyapp/Tusky/issues/3505, https://github.com/tuskyapp/Tusky/issues/3320).

This is fixed with the re-write. As I say, initially for Notifications, but on all timelines in the next major release.

No more “Load more”

The “Load more” list entry you might see as you scroll through your notifications? That's gone. Notifications now load automatically as you scroll through the list.

You can jump to the top of the loaded list by tapping the icon in the tab header (that's not new in this release). v22.1 will add a menu item to jump to the most recent notifications, even if they haven't been loaded yet.

Remembering the reading position

Your “reading position” is remembered. If you leave the tab and come back (even if you fully exit the app) you'll be restored back to the notification you were last reading.

If that notification no longer exists (maybe you dismissed it, or changed your filters so it's no longer visible) then Tusky tries to put you as close as possible to that position.

Better error handling

Your server might not be available. Maybe it's down for maintenance, or your phone doesn't have Internet connectivity. Tusky now handles this more gracefully, with better options to recover.

If you're scrolling through notifications and the next set fail to load the current set remain, and you'll see (a) more detail about the error, and (b) an option to retry. The error details are important, as they give you a better idea of what the problem is and how you might fix it.

If you interact with a post in the notifications tab (boost, favourite, bookmark, etc) and that fails you'll also be told why, with a “Retry” option.

Previously those failures could be invisible, and lead to user feedback like "I bookmarked this post, the icon changed, but it's not in my bookmarks". That could happen because the "bookmark this post" request was sent but never reached the server. Tusky optimistically updated the UI to show the bookmark had succeeded, but didn't update the UI when the request failed.

This was me, in https://github.com/tuskyapp/Tusky/pull/3159.

Tusky v22 changes how Android notifications are displayed.

To understand why the behaviour is the way it is, it helps to know a little bit about how Android allows apps to send notifications. Strap in, this gets a little long.

Tusky notifications are sent to Android notification “channels”. Tusky creates one channel for each type of Mastodon notification you can get (boosts, favourites, mentions, etc).

Each channel is associated with a Tusky account, so if you're logged in with two accounts you'll have two channels for boosts (one per account), two channels for mentions, etc.

You can see all of this in “Account preferences > Notifications”.

Each channel has its own controls for how notifications appear. Do they pop up on screen? Do they make a sound? Do they ignore do-not-disturb, etc? That's so you can choose to get a sound if someone mentions you, but no notification at all if someone new follows you (for example). All of that is also controlled from “Account preferences > Notifications”.

Android apps can not see what the user has set their notification settings to.

This is so that a malicious app can't refuse to run unless you allow its notifications to make a sound — apps can't “blackmail” you if you disable their notifications.

Android imposes two other limits on how apps can send notifications.

First, an app can only have about 50 notifications active at once. It's “about 50” because different Android devices might set that limit differently, and there's no way for an app like Tusky to find out what the limit has been set to, or what will happen if it posts more notifications than that.

Second, if an app posts too many notifications too quickly Android may silently drop those notifications, and the user will never see them.

How does all that affect how you use Tusky?

Better summaries means more alerts

Previously, all your Mastodon notifications would appear as a single collection of Android notifications with a “summary” notification that collapsed them all.

You could expand the summary notification to see the individual notifications.

It didn't matter what type of notification they were (follow, boost, mention, etc); they all got summarised together.

Turns out that caused a bunch of bugs, which we've squashed. For example, having just one summary notification meant that it could be sent to a channel that you have muted. And then you'd never see them.

That's fixed — notifications always go to the correct channel.

So now you'll see multiple summary notifications. One summary for all your “mention” notifications, one for all your “followed you” notifications, one for all your “boosted your post” notifications, and so on.

This means you can choose to e.g., swipe to dismiss all your “boosted your post” notifications, while leaving the “mentioned you” notifications intact.

However, if you receive a lot of notifications you will probably find you're alerted more than you used to be. The old behaviour was a bug. Tusky was incorrectly hiding some notifications. Now it's not.

You can always control your notification channel settings in “Account preferences > Notifications”. You may want to check that out and either disable some of them, or switch some of them to “silent”.

Older notifications make way for newer notifications

Tusky now limits the maximum number of notifications it will display to stay under that 50 limit I mentioned. If you have more than 50 active notifications Tusky will drop the excess, oldest first. They still appear on the Notifications tab, they just won't be created as Android notifications.

Suppose you have 30 active Tusky Android notifications, and receive another 30 new Mastodon notifications. Tusky will remove the oldest ~10 of the existing Android notifications and then show the 30 new notifications and the remaining 20 of the previous ones.
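A sketch of that pruning logic (not Tusky's actual code, which is Kotlin; the IDs and the exact limit are illustrative):

```javascript
// Keep the number of active Android notifications under the device limit.
// Notification IDs are in arrival order, oldest first.
const MAX_ACTIVE = 50; // "about 50"; the real limit varies by device

function pruneAndAppend(active, incoming) {
  const combined = [...active, ...incoming];
  const excess = combined.length - MAX_ACTIVE;
  // Drop the oldest `excess` notifications; everything else stays.
  return excess > 0 ? combined.slice(excess) : combined;
}

// 30 already active + 30 new = 60, so the oldest 10 are dropped:
const active = Array.from({ length: 30 }, (_, i) => i + 1);    // 1..30
const incoming = Array.from({ length: 30 }, (_, i) => i + 31); // 31..60
console.log(pruneAndAppend(active, incoming).length); // 50
```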

Notifications are created more slowly

Tusky will create those 30 new Android notifications more slowly. Previously it would create them as quickly as possible, which ran the risk of them being silently dropped by the device. Now it creates them roughly once per second.

If you get a lot of notifications all at once you'll see this when the notification icons flash briefly on the Android status bar at the top of the screen.

Synchronised with other apps

Mastodon apps like Tusky — or your server's web interface — can also synchronise the ID of the "most recently fetched" notification with your Mastodon server.

So if you have your Mastodon home page and Tusky open at the same time (or Tusky running on two different devices) a given notification will only appear in one place, so you don't have to dismiss the same information in two different apps or devices.

I did the work here.

Along the way we discovered lots of ways that not-Mastodon-but-claim-to-follow-the-Mastodon-API servers are not quite compatible with documented behaviour.

And some problems with Mastodon itself, like https://github.com/mastodon/mastodon/issues/25245.

We also found and fixed some nasty bugs that occurred if you were logged in with multiple accounts; Tusky could write the data from one account to the database for another account. If you were on the beta track and suffered through the period where you would get dumped at the very beginning of your notification history, thanks for bearing with us while we found and fixed the problem.

Tobi (https://goblin.technology/@tobi) on the GoToSocial team helped debug GoToSocial issues (https://github.com/superseriousbusiness/gotosocial/pull/1867, https://github.com/superseriousbusiness/gotosocial/pull/1734, https://github.com/superseriousbusiness/gotosocial/pull/1719).

Daniel (https://mastodon.social/@dansup/) is looking into the work that is needed for Pixelfed; right now Pixelfed incorrectly reports a new notification every time Tusky checks (https://github.com/pixelfed/pixelfed/issues/4461) so if you have a Pixelfed account in Tusky you may want to mute those until Pixelfed is fixed.

Partly a note-to-self, partly a note to anyone else who has the same problem.

Out-of-the-box, Linux (at least 5.15.0-69-generic #76-Ubuntu) does not recognise the fans controlled by the Asus H770-PLUS D4 motherboard.

You can confirm this if running sensors shows no fan RPM information, and if, when the system comes under any kind of load (e.g., running an Android emulator), the fans spin up to what sounds like maximum speed and stay there, irrespective of the actual CPU temperatures.

To fix this, boot the kernel with the acpi_enforce_resources=lax flag.

On my system (Linux Mint) that was done by:

  1. Edit /etc/default/grub, adding the flag to the existing GRUB_CMDLINE_LINUX_DEFAULT entry
  2. Run sudo update-grub
  3. Reboot.

Re-run sensors and confirm that there is fan information.

I occasionally help out with OpenTechSchool Zurich, answering questions from folks getting into programming. There was a question about the difference between function statements and function variables in Javascript.

This is an edited version of the explanation I gave, in case it's useful to anyone else. It was originally in a chat channel with readers spanning a wide mix of experience levels.

“All teaching is the process of lying, and then replacing the lies with successively better approximations of the truth.”

Some of what follows is lies. But it lets us get closer to the truth.


The original question was:

Especially what confuses me are the functions as there seems to be existent two type of functions like Function expression AND Function statement.

It's not two different types of functions. It's two different ways of referring to a function.

A quick recap of values, types, and variables

I'm going to recap values, types, and variables, to make sure we're on the same page.

It's important to keep these concepts distinct.

At its core, your program is going to be operating on values. A value might be a number like 3, or a string of characters like “hello”, or many other things.

Each value has a type. It's a bit circular, but this is the type of thing it is. 3's type is a number, "hello"'s type is a string, 3.0's type is a floating point number, and so on.

You can't have a value without it also having a type. They go hand-in-hand together.

Different programming languages have different types available to them. Javascript doesn't have many.

Programs would get pretty confusing if we had to use values all the time. Consider the value 3; is it supposed to represent a temperature, a calculation, a shoe size, a distance, ... ?

So we have variables.

A variable is a way of labelling a value with a name.

Think of a variable as being like a box. You put the value in the box, and write a label on the outside of the box telling you something about the value inside the box.

And the box has a window on the side, so you can see the value in the box.

You can think of let temperature = 3.0; as “Create a box, put the value 3.0 in the box, and write 'temperature' on the outside of the box”

Some programming languages require you to say what type of value you're going to put in a variable before you can use the variable.

Javascript does not. You can create a variable containing a string, and then put a number in the same variable later.

let x = "some string";
...
x = 3; // <-- this is allowed

Python is similar to Javascript.

In some other languages, like Java you need to specify the type of value you can store in the variable. Later, if you try and put a value with a different type in the same variable you will get an error.

String x = "some string";
...
x = 3; // <-- this is not allowed in Java,
       // x can only contain values with type "String",
       // and the type of 3 is "int"

Our three key concepts now are:

  • values, like 3, “hello”, and 3.0
  • types, like number, or string. Each value has a type
  • variables, ways of labelling values so they have meaning and can be reused

Javascript can tell you what the type of a value is using the typeof operator.

> console.log(typeof 3);
number

> console.log(typeof 3.0);
number

> console.log(typeof 'hello');
string

> let x = 4;
> console.log(typeof x);
number

The last line is interesting. It will print number, but that's not the type of the variable, it's the type of the value in the variable.

What does this have to do with functions?

Functions are also values, and have types

This is both important, and confusing.

Consider a function like this:

function foo() {
  return 42;
}

What's the type of the function?

Most people will say “number” — the function returns a number, so when they see the question “What's the type of the function?” they actually see “What's the type of the value returned by the function?”.

And the function does return a number. So it's the correct answer to the second question.

To be fair, most of the time, it's what the question actually means, so it's a reasonable assumption.

But it's the wrong answer to the first question. The type of a function in Javascript is function.

You can check this with:

> function foo() { return 42; }
> console.log(typeof foo);
function

ASIDE: Javascript is pretty limited in this respect. Other languages will be more specific, and tell you the type is function () -> number or similar, which means it's a function that takes no arguments and returns a number.

Because it has a type a function is also a value.

For the purposes of this example you can imagine a function's value is all of the code in the function.

If a function is a value, and it has a type, then it's no different to 3, or "hello", or 3.0, which are also values with types (every value has a type!).

This means we can put a function in a variable. Like this.

let my_func = function() {
  return 42;
};

And we can look at the function in this variable.

> console.log(my_func);
ƒ () {return 42;}  // <-- output

Wait, what's going on with the output? That's not 42.

Imagine we had done this:

> let x = 3;
> console.log(x);
3

Javascript is taking the variable x, and telling you the value inside the variable. 3 in this case.

That's exactly what's happening with the console.log(my_func); example. The value of the function is being shown to you as ƒ () {return 42;}

If you want to call the value of the function in the variable you have to use () the same as any other function call.

> console.log(my_func()); // <-- note the extra ()
42

And if you try and use a value that isn't a function type you get an error message.

> let x = 3;
> x();
Uncaught TypeError: x is not a function

The error message could be more precise. It should really say "The value in x is not a function", because in both cases neither x nor my_func are functions; they are variables. It's the value inside the variable that's important.
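Because the type travels with the value, you can check the type of the value before trying to call it. A small example:

```javascript
// Guard against calling a value that isn't a function.
let x = 3;

if (typeof x === "function") {
  x(); // safe: the value in x is a function
} else {
  console.log("the value in x is not a function, it is a " + typeof x);
}
// prints: the value in x is not a function, it is a number
```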

Like I say, it's important to keep the value / type / variable distinction clear.

Function statements and function expressions

Knowing all this we can look at the difference between function statements and function expressions.

A function statement looks like:

function foo() {
  return 42;
}

A function expression looks like:

let foo = function() {
  return 42;
};

These both do the following identical things:

  • Create a variable
  • ... called foo
  • ... that contains a value
  • ... the value is ƒ () {return 42;}
  • ... and the type is function.

The only difference is where the variable is visible from.

Hopefully you're familiar with variable visibility already. It means that this code won't work as intended.

console.log(x);
let x = 4;

On line 1 you try and use the value in the variable x, but that variable is only created later, on line 2.

It's the same thing with function expressions. If you tried this:

console.log(foo());
let foo = function() { return 42; };

that won't work either. The variable foo is used on line 1 before it's created on line 2.

But this next example does work (in a program, not from the Javascript console).

console.log(foo());

function foo() {
  return 42;
}

When you stop and think about it, this is a bit weird. We're able to use the function in the code before we've defined it. How?

Javascript calls this function hoisting.

A function declared using a function statement is “hoisted up” (for non-native English speakers, “hoist” is a synonym for “lift”) so it's available before any code that uses it.

A function declared using a function expression is not hoisted, so it's available only after it's been declared.

And that's the difference.
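Putting the two forms side by side in one runnable snippet:

```javascript
// The function statement can be called before its definition, because it is
// hoisted. The function expression works only after the assignment has run.
console.log(statement()); // 42 -- works, `statement` was hoisted

function statement() {
  return 42;
}

const expression = function () {
  return 42;
};
console.log(expression()); // 42 -- works only because the line above already ran
```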

Because life's too short to spend it reading terrible interview feedback

This is my rough guide to how to prepare to do good tech interviews and provide useful feedback. It’s not a one-size-fits-all guide; it’s the best practices I’ve adopted after giving more than 400 technical interviews to candidates with all ranges of experience and backgrounds who are applying for roles with a strong software engineering component to them.

It assumes you're an interviewer using the typical tech interview format of a 45-60 minute discussion with a candidate, generally focusing on a specific area of knowledge.

I'm not saying that’s the right way to conduct candidate interviews, or commenting on the various merits of this approach vs. take-home coding interviews, or pair-programming with an interviewer, or any other approach. But if this is the approach your organisation has taken to interviews, this is how you can conduct them well.

Before the interview

I assume:

  • You have time to prepare for the interview
  • You are familiar with the role and level the candidate is interviewing for, and the role’s requirements
  • You are not the final hiring decision maker; you are providing feedback to a hiring manager or committee

Your goals for the interview are:

  • Allow the candidate to perform as well as they can
  • Identify the candidate’s skills and ability at those skills
  • Provide a clear, unambiguous hiring recommendation to the decision makers, comparing the candidate's skills and abilities against those required by the role
  • Provide sufficient information about the interview to allow the decision makers to disagree with your recommendation

Set aside time to prepare

You must go into the interview prepared, knowing what you expect to ask the candidate, likely detours, and expectations for their performance.

If you're not used to interviewing allow several hours for this. As you do more interviews this should take less time, but with more than 400 interviews under my belt I still allow at least an hour.

Do this several days before the interview if possible. 30 minutes before the interview is the wrong time to discover you’ve been asked to interview a candidate you’re not qualified to interview.

Form an expectation

You should go into the interview with an expectation as to how the candidate will perform on the questions you ask them. This allows your feedback to focus on whether or not they met those expectations.

This also allows the hiring group to decide your expectations were wrong and evaluate your feedback in the context of their expectations instead.

Your expectations about a discussion on a topic with a candidate with 2 years experience and one with 10 years experience should be very different.

Read the candidate's CV. Pay attention to:

  • Their most recent roles. Anything in the last three years is reasonable to expect them to be good at. Anything before then is going to get increasingly rusty.
  • Specific projects they have worked on, and their role on those projects
  • Specific technologies they have used
  • Anything that overlaps with areas you are an expert in

If they list them, look at any open source or other public contributions the candidate has made. You're looking for things that will let you go into the interview thinking “If I ask the candidate about X then they should be able to do very well”.

Do not form an opinion if there isn't any open source work to review. You have no insight into why, and it is also inappropriate to ask the candidate. This gives you zero signal.

Setting the expectations down before you've met the candidate may also help prevent forms of bias creeping in after you've had the chance to meet them. This is a double-edged sword, of course; be careful of any biases you may have from the information in their CV.

Prepare questions to ask

You probably have a set of questions you're comfortable asking, and stick to those where possible. That's ok, but don't blindly repeat the same questions interview after interview.

Tailor the topics to the candidate's experience

You want the candidate to do as well as they can. So focus on playing to their strengths, while still being relevant to the role. If they do not meet the expectation you want to be able to say “I gave them every opportunity to demonstrate their abilities”.

For example, if you typically start infrastructure scaling discussions using web services as an example, but the candidate has experience with mail (SMTP) infrastructure rather than web infrastructure, adapt your questions accordingly.

Don't: Ask questions you can predict the candidate won't do well on. For example, anything involving the minutiae of a programming language they last used 3 or more years ago doesn't make sense — they're probably too rusty on the specifics to get a useful signal.

Or if the candidate's CV suggests they have experience working with systems of ~ 10 servers, don't ask them to design something that needs ~ 1000 servers — once you go up by more than two or three orders of magnitude the scaling issues and bottlenecks become very different. Since you can already predict the candidate will probably do poorly on the question there's no point in asking it.

Don't: Get hung up on having, e.g., three topics prepared, and you want to make sure you get to discuss all three topics. If you've started on a topic and you've found yourself having an interesting, deep technical conversation with the candidate and you feel like you can continue to explore the topic and learn more about the candidate's abilities then do so.

But, if you feel you have got all the information you need from the discussion (e.g., their answer so far is going to help you make a firm yes/no decision) then you can stop early and move to the next topic.

Start writing your feedback

Once you have your understanding of the candidate, the topics you're going to cover, and your expectations, write those down as the skeleton of your draft feedback.

This helps speed up the process of submitting feedback after the interview.

It also helps guard against any bias happening during the interview. Perhaps the candidate has a particularly charming personality, and you've left the interview convinced they should be hired because of that instead of their skills. Setting out your expectations now helps ensure they are not coloured by experience during the interview.

During the interview

Keep track of time

Pay attention to the time in the interview.

You want to make sure you get through the topics you had planned to cover (but see the previous section, re being flexible).

You should note down how long the candidate spent on certain topics. How long a discussion takes to hit the points you hope to cover should be part of your expectation going into the interview.

Don't interrupt

Allow the candidate to think. This is especially important in a design or coding interview.

Silence during an interview while a candidate is thinking can feel like an eternity.

Resist the temptation to fill the void by talking while the candidate is thinking, designing, or coding. Similarly, if you see they've made a mistake do not point it out immediately, wait.

Talking can disrupt their flow of thought — imagine if you were trying to write code with someone looking over your shoulder interrupting every time they noticed a typo.

It also robs the candidate of the opportunity to notice and fix the mistake themselves. Seeing whether a candidate can spot a mistake on their own, and how they fix it, can be useful information.

Make sure you have told the candidate, at the start of the interview, that it's ok for them to ask questions or ask for help if they get stuck on something. But wait for them to ask; do not offer help unbidden.

After the interview

Finish writing up your feedback.

You need to:

  1. Provide a clear opinion, supported by evidence, as to whether or not you think the candidate should be hired
  2. Provide enough evidence so if the hiring committee decides they disagree with your opinion they have enough information to reconstruct what happened in the interview and form their own opinion.

I recommend you structure your feedback as follows.

  • Summary, consisting of three paragraphs.
    1. Explain what you understood from the candidate's CV
    2. Set out your general expectations
    3. A conclusion, explaining whether you think the candidate should be hired. If it helps, start this with the phrase "We should hire X because..." or "We should not hire X because..." Be extremely clear in explaining your conclusion, calling out specific parts of any answers you think were particularly strong or weak. If you have not come to a conclusion then you've wasted the interview.
  • Discussion sections
    • For each topic you discussed set out:
      • The initial question, as you asked it to the candidate
      • Your expectations ahead of the interview
      • Details from the discussions
      • Your assessment of the discussion relative to your expectations

Include code and designs

If the candidate designed something in the interview, include the design. If it went through multiple iterations, include the iterations.

If the candidate wrote code during the interview include the code in your feedback. If it went through multiple iterations, include the iterations.

Be extremely clear in your feedback about what you told the candidate about what is/is not acceptable in code.

For example, you will probably get different code from a candidate if you say “I just want to see something that works, don't worry about making it testable, or checking for errors” than if you say “Please write code you would be comfortable sending for a code review”.

If you told the candidate the former, and the hiring group doesn't know, they may well err on the side of caution and assume the candidate always writes scrappy code.

If the candidate called out anything they said they would normally do, but are omitting for speed during the interview then call that out too. For example, if the candidate is writing a function that has to interact with the current time they might say something like “Ordinarily I'd pass an object in with methods to get the date/time here, because it makes this much easier to test. I'm going to skip that for the moment” then this is valuable (and good!) thinking on the part of the candidate. Note it down.
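To make that concrete, here's a minimal Ruby sketch of clock injection (the names are mine, purely illustrative):

```ruby
# A trivial "clock" dependency: anything that responds to #now will do.
# Production code passes the real clock; tests pass a fake pinned to a
# known instant.
class FixedClock
  def initialize(time)
    @time = time
  end

  def now
    @time
  end
end

# Instead of calling Time.now directly, the function takes the clock as
# a parameter, which makes the time-dependent behaviour trivial to test.
def greeting(clock)
  clock.now.hour < 12 ? 'Good morning' : 'Good afternoon'
end
```

In production you can pass `Time` itself (it responds to `now`); in a test, `greeting(FixedClock.new(Time.new(2023, 1, 1, 9, 0, 0)))` reliably returns the morning greeting.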

Appendix

Sample feedback

This is an example of the feedback format, and the level of detail I want to see when I'm trying to figure out if we should hire someone.

If your response to this is “Holy crap, that's a lot of detail, I'm a busy engineer, I don't have time to do this” then I'm here to tell you you're wrong. Growing the size of the team with skilled engineers is one of the largest-impact things you can do. The recruiting effort requires time from coordinators, recruiters, other interviewers, and, of course, the candidate. Don't waste their time by turning in a single paragraph of poorly detailed non-specific feedback.

Summary

Alexa has been working as a developer for 4 years, most recently implementing the design of a backend system, written in Ruby, for a crowd-funding site that has grown 100x in volume over the last two years. She's part of a team of 6, and appears to have been promoted up the eng. ladder once in those 4 years, which is promising.

Expectations: She should be able to talk about the project in detail and be comfortable writing code. I'm not expecting significant design skills yet due to relative lack of experience and being on a team where others are doing the bulk of the design work.

Conclusion: Hire. She communicates clearly and can talk in a detailed, organised fashion about the work she's been doing. She had sensible ideas about improving the project to scale it out. Her code was clean, and idiomatic, the tests made sense, and she didn't over-engineer things to deal with problems that don't exist. Based on this, I am confident I could give her a project like X and she would be able to complete it independently, which meets the expectations for this role.

Topic: Describe your most recent project

Expectation: She's been on this project for 4 years, 2 as a junior dev, 2 as a mid-level dev, so should have a solid understanding of it, the tradeoffs, and the areas she feels can be improved. She should also be able to explain, in detail, how the system has scaled, and failure modes that have been anticipated and designed around.

What happened:

[Include the narrative here, which I've skipped because I'm writing a blog post, not interview fanfic]

Conclusion: Good.

  • She gave a detailed, organised description of the system, structured around the path a user's request takes (I thought this was a good approach, it introduced each area of the system in turn, and in a relevant context, each explanation building on information from the previous explanation).
  • She understood the current scaling issues the system is facing, and the suggestions she made for fixing them are sensible
  • She was able to talk about how the current system could be split up along service boundaries to re-implement services in more performant languages to deal with future scaling issues.
  • Her proposed approach to testing the migration (modify frontends to send query traffic to old and new backends, compare the results for equality, export data about the result, throw away the result from the new backend) is sensible.

Topic: Write FizzBuzz in your language of choice

The problem is to, for the numbers from 1 up to some limit, print either:

  • “FizzBuzz” if the number is divisible by 3 and 5
  • ... or “Fizz” if the number is divisible by 3
  • ... or “Buzz” if the number is divisible by 5
  • ... or the number

I asked for code she would be comfortable sending for initial code review, but not to worry about documentation comments for functions. I did not specifically ask for tests or any error checking to see what she suggested.

In case it wasn't clear, this is a joke question. Do not ask this. Tom Dalling has a great blog post, “FizzBuzz in too much detail”, https://www.tomdalling.com/blog/software-design/fizzbuzz-in-too-much-detail/ which explores this rabbit hole

Expectation: She lists Ruby as her preferred language, and the project she's been working on for the last four years is Ruby-exclusive, so she should be very familiar with it. I expect initial working code in less than 10 minutes. If she doesn't suggest it, I'll ask for tests.

What happened:

She noted down the problem to refer back to it and repeated it back to me, said she'd focus on getting something that worked first, then clean it up, opened an editor, and in just over 3 minutes had written:

def fizzbuzz(limit)
  1.upto(limit) do |i|
    if i % 3 == 0 && i % 5 == 0
      puts 'FizzBuzz'
    elsif i % 3 == 0
      puts 'Fizz'
    elsif i % 3 == 0
      puts 'Buzz'
    else
      puts i
    end
  end
end

This has a bug (she repeated the i % 3 case for Buzz). Before running the code she re-read it through and spotted the problem and corrected it without my prompting.

After fixing the code she ran it and confirmed it worked. Unprompted, she said she'd like to quickly re-write it to make this problem less likely. I agreed, and she re-wrote it to:

def fizzbuzz(limit)
  1.upto(limit) do |i|
    fizz = (i % 3 == 0)
    buzz = (i % 5 == 0)
    puts case
      when fizz && buzz then 'FizzBuzz'
      when fizz then 'Fizz'
      when buzz then 'Buzz'
      else i
    end
  end
end

She wondered aloud about whether this should accept more parameters to specify the text to print, or the specific divisors to use, and decided this was probably over-engineering, unless there was a business case for it. I agreed.

She said she'd want unit tests for it before it was ready for review, and I agreed. She started sketching out a test, then quickly realised the function has side effects (printing the output), so rewrote it to return the results, and a set of tests to verify the return value.

# Assume I'm sufficiently familiar with Ruby to write that code and test here.
# I'm sure you get the gist

This was the final code. We spent 25 minutes on this topic.
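For the curious, a testable rewrite along the lines described might look like this — my sketch, not the candidate's actual code:

```ruby
# Returns the results rather than printing them, separating the logic
# (pure, testable) from the side effect (output, done by the caller).
def fizzbuzz(limit)
  1.upto(limit).map do |i|
    fizz = (i % 3 == 0)
    buzz = (i % 5 == 0)
    case
    when fizz && buzz then 'FizzBuzz'
    when fizz then 'Fizz'
    when buzz then 'Buzz'
    else i
    end
  end
end
```

A test then just asserts on the returned array, e.g. that `fizzbuzz(15).last == 'FizzBuzz'`.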

Conclusion:

Good:

  • Most importantly: Wrote working code that solved the problem
  • Wrote the problem down first, repeated it back to confirm understanding
  • Focused on getting something working instead of over-optimising from the beginning
  • Read the code to double check before running it, spotted a problem, fixed it
  • Implemented sensible refactorings
  • Suggested writing tests without prompting, as part of making it review-ready
  • Good sense to not over-engineer code without reason

Could be better

  • Could maybe have realised earlier that printing the results directly would make the code more difficult to test. Although as an iterative approach to development I'm fine with it.

I'm worried by a trend I see where software intended for use in production systems is defaulting to automatically loading configuration information from environment variables.

I think this is a bad idea.

Initial configuration to a production system should be explicit and visible, so configuration should come from one of two places:

  • Command line flags
  • Configuration sources referenced by command line flags (e.g., URL to a service that the process can contact to fetch additional configuration)

Note: How those command line flags are generated and provided is a separate, orthogonal, story. Maybe it's a systemd service file. Or a Kubernetes job definition. Or an Ansible role. Or something else.
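As a concrete sketch of what flag-only configuration looks like, here's a Ruby example using the standard library's OptionParser (the flag names and URL are invented for illustration):

```ruby
require 'optparse'

# Every piece of configuration arrives via an explicit flag. Nothing is
# silently read from the environment, so the full configuration is
# visible wherever the command line is visible (ps, service files, ...).
def parse_config(argv)
  config = {}
  OptionParser.new do |opts|
    opts.on('--port PORT', Integer, 'Port to listen on') { |v| config[:port] = v }
    opts.on('--config-url URL', 'Service to fetch additional configuration from') { |v| config[:config_url] = v }
  end.parse(argv)
  config
end
```

Started as `my-service --port 8080 --config-url http://config.internal`, everything the process was told is recoverable from the command line alone.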

Using environment variables by default is an attractive nuisance that complicates troubleshooting and reproducibility.

Troubleshooting

When troubleshooting a service I'm almost certainly going to need to try and understand what its configuration is.

The service might provide a dedicated interface to see this, but since there's no standard for this each service's mechanism is likely subtly different. That's more information to have to recall when managing a system.

Everything accepts command line flags. So looking at the command line a program was started with (whether that's in a log line, from running ps(1), inspecting the service file, reading the Kubernetes configuration, on a dashboard somewhere, ...) is the universal way to do this.

Using environment variables by default breaks this. Now I have to have a mechanism for seeing the environment variables the service was started with. Either this is going to show me all the environment variables, and I have to remember which one was relevant. Or the mechanism is going to have to know, on a service by service basis, which ones are important and only show me those.

This, and any other mitigation measures like it, are added complexity that could be avoided by not automatically using environment variables in the first place.

Reproducibility

This is closely linked to troubleshooting, because when troubleshooting a problem it helps to be able to reproduce it.

If your service is reading configuration from environment variables it no longer suffices for someone to file an issue and provide a simple command line to reproduce the problem.

Now they need to know that the service is automatically loading environment variables, and they need to provide those as well. And if they don't, then you have to (a) know that these variables are important, and (b) get back to them and ask for this additional information.

When is it OK?

There are some occasions where this is OK.

It's not a production service

For non-production services, sure. Anything goes. As it says at the top of this post, these are my rules for handling production services.

It's an extremely common environment variable

There are a number of environment variables that are so widely used that it's impossible to escape using them. Things like HOME, LC_*, PAGER, TZ, and so on.

https://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html has a good list.

Even if you do this you're still creating troubleshooting and reproducibility problems for yourself, especially with the LC_* and TZ variables.

In fact, I think there's an argument to be made that production-grade software that deals with locale related information should refuse to read the LC_* and TZ variables and require that that information be provided as explicit configuration.

If that means that the command line is a bit longer because it has a --tz=${TZ} entry I think that's a small price to pay for making critical configuration information like this completely explicit.
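A sketch of that stance in Ruby — require the timezone as an explicit flag and deliberately refuse to fall back to the environment (the error handling here is my invention):

```ruby
require 'optparse'

# The timezone must be passed explicitly; there is intentionally no
# fallback to ENV['TZ'], so the value is always visible on the
# command line.
def timezone_from_flags(argv)
  tz = nil
  OptionParser.new do |opts|
    opts.on('--tz ZONE', 'Timezone, e.g. Europe/Zurich') { |v| tz = v }
  end.parse(argv)
  raise ArgumentError, 'missing required --tz flag' if tz.nil?
  tz
end
```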

Here are some rules for a healthy on-call rotation, based on personal experience and benefiting from the experiences of others.

If you don't like the word “rules” then consider them to be guidelines or recommendations instead.

Some of these are practical things that you can implement immediately. Others are cultural changes that might take a while to embed in your organisation.

Note: The practice of responding to, managing, and communicating around incidents is a whole other topic, and not covered here.

Most of them are, in some way, intended to help manage information, and make sure it is clearly communicated within the on-call team and to the rest of the organisation.

It's likely incomplete, I'll add to it as I consider / remember more things.

Assumptions

There's a few assumptions in what follows:

  • Your service needs 24x7 on-call support
  • You have at least preliminary SLOs for the service, and pageable alerts are generated when the SLOs are exceeded or the error budget is being consumed too quickly
  • Expected response time is measured in minutes, not hours. No more than 10 minutes should elapse between the on-call engineer being paged and them acknowledging it and being hands-on to remediate the issue.

You will need a certain level of organisational maturity in order to achieve all of these.

tl;dr

Put together an on-call team like this:

  • The team is split in to at least two groups (“sites”), as close to 12 hours apart as possible (e.g., Berlin and San Francisco)
  • Each half of the team contains between 6 to 9 people
  • The team members are engineers
  • Each week one person from each half of the team is on-call for 12 hours per day. They are the primary on-call for that 12 hour shift
  • In the same week, someone else from the team in each site is the secondary on-call
  • The secondary for a week will be the primary the following week
  • Compensate time as primary on-call outside normal business hours
  • Don't try and be on-call and do project work at the same time
  • On-call is on the engineering job ladder

At least two sites, close to 12h apart

Do: Split the on-call responsibility amongst at least two teams, close to 12 hours apart. Or 3 teams close to 8 hours apart.

Why: Don't put people on-call 24x7 and expect them to be good at it. Someone getting paged at 3am is not in a fit state to mitigate whatever problem they've encountered, and one or more nights of badly interrupted sleep are terrible for productivity and health the next day.

Another way of thinking about this is “Don't set SLOs you are not staffed to meet”.

My experience is mostly with companies with a presence in California and Europe, and I've seen a rotation split 7am-7pm CET / 10am-10pm Pacific work pretty well (with the caveat of the few weeks each year where daylight saving changes throw everything off by an hour).

6-9 people per site

Do: The on-call rotation for the service in each site should be 6-9 people.

Why: 6 people as the minimum per site implies each person in the rotation in the site is going to be on-call at least once every 6 weeks.

Realistically, with team members being out sick, or on vacation, or shifts needing to be swapped around, every 5 weeks is more likely.

On-call any more frequently than that and it's extremely difficult for individuals to make progress on their project work — they're being interrupted too frequently.

You can go up to 9 (perhaps even 12) people in the rotation per site, but any more can be problematic if the rate of change in your production infrastructure is high. You risk someone going on-call without enough knowledge/context about the production infrastructure to be effective.

Primary and secondary on-call responsibilities

Do: Each week have a primary and secondary on-call responsibility.

The primary's daily responsibilities are (in rough order of priority):

  • Respond to incoming pages for the service
  • Triage, assign, and work on tickets/issues assigned to the service
  • Triage, assign, and respond to organisational questions about the service
  • Prepare the daily handover notes (see later)
  • Prepare for the weekly handover meeting (see later)

The secondary exists to be the backstop / safety net for the primary if they are unavailable or unable to respond. They are also the first person to step up if the primary is dealing with a higher priority task from the previous list and a lower priority task needs attention.

So they should periodically — once or twice per day — set aside a few minutes to review any issues that have come in with the primary to ensure they are up to date with the state of the service, but shouldn't need to do more in a typical week.

Why: The primary on-call in each site is the person who gets paged for the service first.

Sometimes, though, the primary will be unavailable. That might be planned (a commute where on-call response is impractical, child care responsibilities) or unplanned (technology failure).

In the case of planned unavailability it is the primary engineer's responsibility to co-ordinate with the secondary ahead of time.

In a well-running shift the secondary never receives an unexpected page.

The secondary at week N will be the primary in week N+1, allowing them to carry over the state they've absorbed from their week as secondary in to the primary week.

Example: In one on-call team I had a ~ 30 minute commute to the office where it would be difficult to respond to a page (10 minutes on a bus, 10 minutes on a train, a 15 minute walk, and some waiting time). Whenever I was primary I would co-ordinate with the secondary to make sure we were not commuting at the same time, and would positively acknowledge to them before the start and end of each commute.

Have at least three escalation levels, maybe more

Do: The primary on-call for the service gets paged first (level one).

If the primary doesn't acknowledge the page within some time, the secondary gets paged (level two).

If the secondary doesn't acknowledge the page within some time, everyone in the on-call team in that site gets paged (level three).

Optional:

If the team in that site does not acknowledge the page, the previous primary on-call in the other site gets paged.

Note: It's the previous primary, not the next primary, because if you're unlucky this is happening on the day of a shift change, and the next primary doesn't necessarily have a lot of context about the state of production.

Then the previous secondary in the other site.

Then the on-call team in the other site.

Note: This is extremely belt and braces. In 15+ years of working in rotations like this I can count the number of times a page was missed by both the primary and secondary (and therefore fell through to the whole team in one site) on the fingers of one hand.
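The escalation chain can be modelled as ordered data plus a single lookup — a toy sketch (names mine), not a real paging system:

```ruby
# The escalation levels, in paging order. Each entry is paged only if
# nobody earlier in the chain acknowledged the page in time.
ESCALATION_CHAIN = [
  :primary,
  :secondary,
  :site_team,
  :other_site_previous_primary,
  :other_site_previous_secondary,
  :other_site_team
].freeze

# Walk the chain and return the first responder who acknowledges, or
# nil if the page falls all the way through (which should be
# vanishingly rare).
def first_acknowledger(available)
  ESCALATION_CHAIN.find { |who| available.include?(who) }
end
```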

On-call team for a service is composed of engineers for that service

Do: The on-call team is composed of engineers, and ideally drawn from the regular software engineer population.

Why: If you're writing code that might page someone, you need to be prepared to be the person that's getting paged.

This doesn't mean the team is wholly composed of product software engineers. Perhaps a 2/3rd to 1/3rd split, with product engineers doing a 6 to 9 month “tour” with the team.

Note: “service” here might cover just a single replicated server, or it might cover a whole fleet of different servers working together to provide a service to your users. It really depends on your product and its architecture.

The team doesn't exist to be on-call

Do: Recognise that the team does not exist to be on-call.

Why: I've used the phrase “on-call team” repeatedly throughout this text, it's a convenient shorthand, but it risks obscuring a fundamental truth.

This is not a team that exists to be on-call.

This is a team that exists as one of many teams working together to try and ensure the service is meeting its SLOs.

Being on-call is a tool the team uses to help it achieve the goal, but that's it. Being on-call is not the goal of this team. Helping to prevent incidents, and swiftly mitigating the ones that do occur, is.

Compensate on-call appropriately

Note: There are legal issues here. For example, I believe that in Switzerland, you need (a) special approval from the government before you can require an employee to work on a Sunday, and (b) employees must be able to take their accrued time off from on-call work within a certain period of time. I am not a lawyer, this is not legal advice.

On-call work should be compensated appropriately, and with no regard to whether or not any of the team were actually paged. Having to be on-call with a short response time is disruptive, even if you don't get paged.

My recommended baseline is as follows:

Suppose a 12 hour on-call shift and a 9 hour normal working day. Compensate the 3 extra on-call hours during the week days with equivalent time off at 50%, and the 12 extra on-call hours during the weekend days with equivalent time off at 100%.

I.e., someone doing one week of on-call, 12 hours per day, accrues 5 days x (3 hours @ 50%) + 2 days x (12 hours @ 100%) = 31.5 hours, or 3.5 working days.
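That arithmetic as a quick sanity check in Ruby (same assumptions as above: 12-hour shifts, a 9-hour working day, 50% weekday and 100% weekend rates):

```ruby
WORKING_DAY_HOURS = 9.0

# Hours of time off accrued for one week as primary on-call.
def oncall_accrual(shift_hours: 12, weekday_rate: 0.5, weekend_rate: 1.0)
  weekday_extra = shift_hours - WORKING_DAY_HOURS # 3 extra hours per weekday
  5 * weekday_extra * weekday_rate +              # five weekdays at 50%
    2 * shift_hours * weekend_rate                # two weekend days at 100%
end

oncall_accrual                      # => 31.5 hours
oncall_accrual / WORKING_DAY_HOURS  # => 3.5 working days
```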

This time is to allow people to recover from an on-call week, and catch up on activities that they were otherwise unable to do. I recommend that you strongly encourage employees to take this time relatively shortly after their on-call week ends (not necessarily in one large block), and expire any un-taken time on a rolling 9 month basis.

This is for time as the primary only. Time as the secondary is not ordinarily compensated, as it is not ordinarily that onerous.

Note: Special circumstances can occur. If there is an incident requiring many people to work over a weekend then treat it specially, and figure out how to fairly compensate everyone involved.

Don't mix and match on-call and project work

Do: When someone is primary on-call their regular project work does not exist. Their job is to be the primary on-call.

Why: Humans suck at multi-tasking. It can be tempting to try and be on-call and make progress on primary project work at the same time.

It doesn't work, and can be a significant source of stress. So don't do it.

That doesn't mean a particularly quiet on-call week sees the primary on-call sit around twiddling their thumbs wondering what to do. There's always something to do that can be safely interrupted:

  • Improve a runbook or other team documentation
  • Read a technical paper from the backlog
  • Experiment with an automation tool
  • Watch a video from a relevant conference you couldn't attend

and so on.

Note: On interviewing — you can, if necessary, be primary on-call and still carry out interviews. If you do then first clear it with whoever is secondary on-call and make sure they can cover for you for a period some time before the interview (however long you need to prepare) and after (however long you need to provide feedback). If this is not feasible then it is simpler to ensure the recruiters know who is on-call when, and that they are not scheduled for interviews while on-call.

Corollary: Budget for time “lost” to on-call work when project planning

Do: Make sure any project plans / expectations for how long a project will take properly account for the at-least-two-weeks anyone in the on-call rotation loses each quarter due to being on-call.

Why: Assuming a 6-person on-call rotation per site each person is going to be primary on-call at least twice in a quarter, maybe three times. That's a lot of time where they won't be working on their other projects.

It's important this is accounted for when estimating how long work will take.

This can be particularly pernicious when a project requires multiple people from the on-call team in order to complete it, and the project repeatedly stalls for a week because the next person on the critical path is now the primary on-call.

Have a written handover at the end of each shift

Do: At the end of each shift hand over responsibility to the next primary, ideally with a short log of anything that happened on the shift, reminders about any upcoming changes, and so on.

Why: It is important the next primary on-call has sufficient knowledge about the state of the system.

I like to use something that creates a permanent record. A shared document can work for this as a good starting point.

In previous teams we wrote a small web app that prompted the user for pertinent information, as well as querying other systems to pre-populate the handoff report with e.g., “Here are the tickets that were opened during this shift”.

The primary on the other site should positively acknowledge receiving the handover.

Have a weekly review / handoff meeting

Once a week the primary/secondary engineers are going to change.

As close to this time as possible you should have a handoff meeting, to ensure the incoming primary/secondary on-call engineers are aware of the state of production, any ongoing problems, and so on.

This is separate to any weekly “team meeting” that might exist (e.g., to discuss project work, incident trends, etc).

Invite / require other attendees — for example, if you have a platforms team, or a networking team, having one person from that team attend can be very helpful to provide additional context to discussions about incidents involving those teams, and they can talk through any upcoming work the on-call team needs to know about.

Typical agenda:

  • Review / recap of the previous week's pages and tickets
    • What happened? Is it likely to happen again? Does it need a postmortem? Does more investigation need to happen? Do any changes put in place to mitigate the issue need to be removed?
  • Review of upcoming planned production changes
    • Turning up new capacity? Deploying new network equipment? Turning down an old system? If it's planned, and has the potential to go wrong, discuss it here.
    • Obviously, this should not be the first time the team is hearing about the work
    • The on-call team has the opportunity to veto / delay any proposed work if they're not comfortable with the level of risk (e.g., two major changes happening at the same time). This power is vested in them because they carry the responsibility of the pager
  • Review of open issues from postmortems
    • Is the work to remediate the incident trigger underway / on-track?
    • Does anything need to be escalated?
  • Review silenced alerts
    • If you have the technical capability to silence some alerts, review the silences. Are they still needed? Will any of them expire unexpectedly in the middle of the next shift?
  • Verify access, pageability
    • Confirm the incoming primary/secondary have the necessary credentials to make any necessary changes to production.
    • Confirm a test page is received by the primary, and they can acknowledge it.

The team owns the on-call schedule

Do: Allow the team to modify the on-call schedule as necessary, swapping shifts, adjusting cover, and so on.

Why: I've seen an anti-pattern where a team's manager (or lead) decides any changes to the on-call schedule should be vetted or approved by them.

Do not do this. It adds no value to the process.

As long as there is a mechanism that accurately records who is primary/secondary at a particular time (so the alerts can be delivered correctly, and compensation can be calculated), let the team modify the schedule as necessary.

For example, if the primary discovers there's an afternoon where they are unable to be primary, it's up to them to work with the rest of the team to arrange cover — typically the secondary would step up, and someone else would take over the role as secondary. The team should be perfectly capable of doing this without a manager needing to approve any changes.

A significant part of what we do as programmers is manage complexity.

Broadly (and I'm simplifying), the complexity of the solution to a given problem is constant.

It's likely the solution will consist of multiple parts working together, and we decide which parts should be simple, with the knowledge that that will make other parts more complex.

Some examples:

  • Building a new service? Make it stateful and you've vastly simplified how it works. But scaling it and routing traffic to it is now a lot more complex because more of the system needs to know and share the state.

  • Building and running a multi-stage CI/CD pipeline? That's pretty complex. But now the steps the developers have to take to get code into production are less complex.

  • Instrumenting your code to export relevant metrics and other information for debugging and troubleshooting? That's more complex than not doing it. But it makes future debugging and troubleshooting less complex.

  • Writing a piece of code that does many things with a lot of internal dependencies is fairly easy, but testing it is a lot harder. Breaking it in to smaller functions and using techniques like dependency injection can make the code more complex to write and for other programmers to follow, but is a lot easier to write tests for.
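A tiny Ruby illustration of that last trade-off (names mine): the injected version carries a little more ceremony, but the logic becomes testable without any real data source:

```ruby
# The data source is injected as a callable rather than hard-coded, so
# tests can pass a stub instead of standing up a real database.
def format_report(fetcher)
  fetcher.call.map { |name, count| "#{name}: #{count}" }.join("\n")
end

# Production injects a real query; a test injects canned data:
stub_fetcher = -> { [['errors', 3], ['requests', 120]] }
format_report(stub_fetcher) # => "errors: 3\nrequests: 120"
```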

The general pattern is that you can make it less complex here, but the complexity over there goes up. It's up to you to make mindful decisions about how to trade off the complexity of different parts of the system.

Just like its real-world counterparts — e.g., conservation of momentum — this only holds for your solution in isolation. Over time you can find that the complexity of your solution needs to increase, because external forces (e.g., changing business requirements) require the solution to change. Rarely (and happily) you'll discover that you can reduce the complexity of your solution because, e.g., some other system has implemented part of the solution, so you no longer have to.

I was recently the attempted victim of a scam when I moved apartments in Zurich. I hope this write up of what happened and how I dealt with it will be useful to other people when the same scam is tried on them.

tl;dr

If you just want the quick takeaway, it's as follows:

  • I agreed with Ants Transport, a moving company, to move apartment, arranged through MOVU.
  • They took an inventory of the possessions to move
  • On the moving day they “discovered” their inventory was incomplete, and demanded I pay CHF 650 extra, in cash, to complete the move
  • MOVU refused to deduct this from the original bill
  • I get lawyers involved, MOVU eventually waives the full amount

The rest of this post goes in to more details about what happened.

Finding a moving company with MOVU

When I needed to move apartment last year I discovered MOVU. They're a marketplace for moving companies.

As a customer, you sign up on their site and post details about your upcoming move – where from, where to, how many boxes you need, whether there's an elevator in the old and new apartment, that sort of thing.

MOVU makes that available to different moving companies, who then put in an offer to do the move. As the customer you can read the offers with the prices, compare what's included in the price, and so on. Then you choose the company you want, and further dealings are with them.

This seemed quite useful, so I signed up with MOVU. Even before making the details available, they offer to come around and take an inventory of the apartment, which they can forward on to the moving companies.

I took advantage of this, and a guy from MOVU showed up one day with an electronic questionnaire we completed while he was walking around the apartment taking pictures of the furniture and I was pointing out other things that would need to be taken care of – for example, filling holes in the walls when shelves and pictures were removed as part of the move.

Getting quotes

MOVU worked as advertised, and I started receiving quotes from various moving companies. Prices ranged from under CHF 2,000 to over CHF 8,000.

I eventually selected Ants Transport, who quoted CHF 3,070.

The quote from Ants included Disassembling / Assembling of furniture as an explicit line item on the quote (that they charge for). The quote does not include, or reference, any inventory of items to be moved, assembled, or disassembled.

Other parts of the quote do – for example (this being Switzerland) you pay to have lights removed from the ceiling, and the quote from Ants included this service, and explicitly listed the number of lights to be moved.

The quote even listed the precise number of holes in the wall that needed to be filled (15, if you're interested).

Run up to the move

In the run up to the move I embark on a period of decluttering. Some chairs, tables, bookshelves all go to friends or second-hand stores, and I donate a great many books (this was all on my own initiative, “Tidying up with Marie Kondo” hadn't come out at this point…).

I don't buy any new furniture in this period. So any inventory taken by Ants is out of date, but is out of date in their favour – it would list items, like bookshelves, I no longer own.

So when Ants arrive to do the move, there is less furniture present than they should expect.

Ants rescheduled

Moving day was originally Friday 21st September. At short notice, Ants rescheduled to Saturday 22nd.

Day of the move

The first half of moving day goes relatively smoothly.

Ants have arrived with a comparatively small van, so the move is going to require two trips.

They wait until they have loaded the first half of my possessions onto the van, and then say there's a problem. There's more furniture than they expected, and before they'll do anything else I have to commit to paying them an extra CHF 650.

In cash.

At this point:

  • I'm about a week away from the end of the tenancy at my current apartment. That's not enough time to find a replacement moving company

  • Ants are holding half of my possessions hostage on their van

I have very little choice but to agree to pay this at the end of the day.

The move continues, into my new apartment.

The quote from Ants had included the use of floor protectors at the new apartment.

And when I say “new apartment” I mean brand new. The building didn't exist a few years ago, I'm the first occupier.

Ants did not bring any floor protectors with them.

At the end of the move, Ants demanded I sign a form saying they'd had to move additional furniture and used extra boxes (despite them not having done so), and then drove me to a nearby cash point to get the money. They refused to leave until I did this.

Day of the apartment handover

I arrive at the apartment to do the handover to the landlord.

The clean of the apartment had gone well, I have no complaints about that.

But there was additional work – chiefly, filling and painting over the holes in the walls left by pictures and shelves – that Ants had quoted for, included in their final price, and not done.

I also discovered Ants had left some boxes there – I'd kept the box from a recent TV purchase because I knew I was going to be moving and figured it would be better to move the TV in its original box. Ants hadn't used it. They'd also left some other boxes as well.

Fortunately for me, my former landlord was very understanding. Since they were having the apartment repainted anyway before renting it out again, they said they would have that work done as well and invoice me for it. And as the handover was technically a few days before my rental contract with them expired, I was able to hire a Mobility car to take the boxes to my new apartment and store them there until I had time to dispose of them.

Informing MOVU

I informed MOVU of all of this, with:

  • photos showing the work that hadn't been done
  • the PDFs I'd been sent with the quote and the contract confirming all furniture was to be moved

I paid the MOVU bill, after deducting amounts for the CHF 650 already paid, the cost of filling the holes in the walls, the floor protection I'd paid for that wasn't used, and the cost of the Mobility car to move boxes Ants had left behind.

There then followed a long e-mail discussion with MOVU, during which they repeatedly contradicted themselves.

For example, they said only furniture listed on the inventory was included in the move (contrary to the wording on the contract I had). Of course, I asked them to send me the inventory. Repeatedly. They didn't – I assume because it doesn't exist.

They also tried to claim they were just an intermediary used to find Ants, and that my actual contract was with Ants. Then, in a subsequent e-mail, they said I shouldn't have given Ants cash on the day, but should instead have called MOVU to resolve the issue.

Resolution

Fast forward a few months. MOVU (and their payment processor, BillPay) are sending increasingly urgent-sounding e-mails demanding the remainder of the bill be paid (and late fees are stacking up).

And I'm sending e-mails to MOVU asking them to provide any sort of proof there was an inventory that listed the specific furniture that should be moved, which they are unable to provide.

So I engage the nice people at Streichenberg to start talking to MOVU on my behalf. MOVU appear to be unable to provide them with an inventory either, and offer to cut the remaining payment in half.

Since they still haven't provided any evidence they're not running a scam, I reject that offer.

And now I've just been informed MOVU have dropped the remainder of the bill, and BillPay have just sent me a “The account is cleared” message.

So, scam over. I win.

Summary

Based only on my experience, one or both of MOVU or Ants Transport are running a scam that works as follows.

  1. You put in a request for a moving company through MOVU.

  2. A MOVU employee (or someone representing themselves as a MOVU employee) arrives and takes an “inventory” of your furniture to be moved. Based on my experience, this inventory is a sham – to this day MOVU have not been able to provide me with a copy of the inventory, or anything proving I saw it before the move (e.g., a signed list of items)

  3. On the moving day, either (a) the moving company discovers they've been given incomplete information by MOVU, and have to do more work, or (b) MOVU and the moving company are collaborating to engineer this situation.

Whichever it is, once the moving company is holding your possessions hostage they can demand more money to finish the work, knowing you have little choice but to agree.

As best as I can tell this is not the first time Ants and/or MOVU have engaged in this behaviour. At the time of writing https://www.umzugsfirmen-check.de/umzugsfirma/ants-transport/A2D8E61B/detail.html contains a review of their service in which the reviewer says “Before they left the movers stood in front of me and asked for more money because [of] … my heavy stuff. I just wanted them to go and I gave them money.”

https://www.ekomi.de/bewertungen-movu.html also contains many 1-star MOVU reviews in which reviewers claim they had to pay extra in cash on the day of the move or immediately before.