The Poor UX/UI of Major Cloud Providers: ECS/VMs Management

The user experience of major cloud providers is surprisingly poor, especially when it comes to basic VM/instance management. Here are some major pain points I've encountered:

1. Complex Regional Segregation (AWS)
   - Instances are strictly segregated by regions
   - No unified view to manage instances across all regions by default (Need specific queries just to get a global view)
   - Sometimes you lose track of which region your instances are in

2. Instance Metadata Management is Unnecessarily Difficult
   - Can't easily rename instances (Google Cloud requires shutdown and restart)
   - No automatic tracking of which IAM account created the instance
   - Limited ability to add notes or tags for running services
   - When an instance runs multiple services (e.g., "service-A" actually running A, B, and C), there's no easy way to update its name or add proper documentation
   - Have to maintain external documentation to track what's running where
   - Need to SSH and check screen/tmux sessions to see what's actually running

3. SSH Key Management Issues (especially Google Cloud)
   - Can't properly manage SSH public keys
   - No ability to add descriptions to keys
   - Can only see a list of root public keys without context
   - No tracking of who added which key
   - No "last used" timestamp (unlike GitHub which handles this well)
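
To illustrate point 1, here is roughly the kind of "specific query" I mean for getting a global view on AWS. It is only a minimal sketch: the function names (list_instances_all_regions, boto3_factory) and the choice of us-east-1 as the bootstrap region are my own assumptions, and it needs boto3 plus working credentials to run against a real account.

```python
from collections import defaultdict


def list_instances_all_regions(client_factory, regions=None):
    """Return {region: [(instance_id, state, name_tag), ...]}.

    client_factory(region_name) must return an EC2 client (for example a
    boto3 one). Regions with no instances are left out of the result.
    """
    if regions is None:
        # Ask one regional endpoint which regions are enabled for the account.
        # Using "us-east-1" as the bootstrap region is an assumption.
        bootstrap = client_factory("us-east-1")
        regions = [r["RegionName"] for r in bootstrap.describe_regions()["Regions"]]

    inventory = defaultdict(list)
    for region in regions:
        ec2 = client_factory(region)
        # describe_instances is paginated, so walk every page.
        paginator = ec2.get_paginator("describe_instances")
        for page in paginator.paginate():
            for reservation in page["Reservations"]:
                for inst in reservation["Instances"]:
                    tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
                    inventory[region].append(
                        (inst["InstanceId"], inst["State"]["Name"], tags.get("Name", ""))
                    )
    return dict(inventory)


def boto3_factory(region_name):
    """Hypothetical helper: build a real EC2 client for one region."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    return boto3.client("ec2", region_name=region_name)
```

Calling list_instances_all_regions(boto3_factory) makes one paginated describe_instances pass per region, which is exactly the sort of thing I would expect the console to do for me by default.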

I do use serverless solutions like Lambda/Cloud Functions and Firebase, but there are always cases where you need ECS/VMs. These basic UX issues make daily operations unnecessarily complicated.

What's your experience with cloud providers' UI/UX? Have you found any good solutions to these problems?

14 points | by Brandon_Chen 64 days ago

4 comments

  • elawler24 64 days ago
    This reminds me of Cloudflare's focus on the innovator's solution and how they justified competing against giants like AWS and Google. They created a new market focused on startup adoption, based on ease of adoption and more reasonable costs (no egress fees on R2, for example, to reduce data lock-in concerns).

    Detailed article from Stratechery here that's worth a read - https://stratechery.com/2021/cloudflares-disruption/

  • aristofun 64 days ago
    Always follow the money.

    There is no incentive to care about UX.

    Companies (i.e. executives, owners) are customers of cloud providers (not devops/devs) - they don't work in its UI => they don't care about it.

    b2b products virtually always have terrible UX for that reason, even in competitive markets (UX is rarely an important competitive advantage)

    • wildrhythms 64 days ago
      I work in front end and this is true. We have decades-old top feature requests, and we were told (by people far above my level) in January to only work on shipping a bunch of harebrained AI features.
      • aristofun 64 days ago
        Also, that is why frontend engineers' salaries are often lower, despite the job being generally harder
  • GenerWork 64 days ago
    I realize this wasn't your intent, but thank you for highlighting how important it is for companies to fund a UX team, and then allow said team to interact with customers to identify and fix issues like this.
  • deathanatos 64 days ago
    Oh, it's terrible.

    Azure, last I knew, required RSA SSH keys. (Attempting to give a new VM an ed25519 key just fails.)

    GCP's project switcher … like why can't I just search by project name? Instead, it only searches the current org, and unfortunately, my org uses multiple GCP orgs, so I'm all day switching between orgs in the project switcher.

    GCP: I have no idea what the deal with the sidebar is, or why it died. The search box is okay, but why can't services be just preloaded, and the search done locally? Nuanced search through docs might require an AJAX, okay, but "clusters" should match GKE instantly.

    GCP: IAM won't show inherited permissions if you don't have getIamPolicy on the thing that the perm inherits from. This can result in weird situations where, e.g., you have editor on a project, but it won't show up in IAM.

    GCP: I swear the API cannot decide if it's a "region" or a "location".

    GCP and Azure: various UI processes will create multiple objects, and it's not clear that that's what is happening. In the long run, this hampers a user understanding how the various bits all fit together.

    GCP: some objects require other objects. I hate hitting this, as GCP's UI will refuse to validate the form, as it sees the field with the dependent object being empty. But GCP's UI won't let you enter anything into the field, either, because it's a drop-down combo box, and it only loads its values when the page does. So even if you pop into a new tab, build the dependent object, you can't add that new object to the form without reloading the page, which wipes the form. This makes filling in the UI O(n²). Literally, I've had my RSI flare up on these forms when they have 3, 4 of these.

    Azure: the portal will redirect/reload like 8 times during login. What on earth. Also, their "SSO" seems to have forgotten what the first "S" stands for. Then, 2FA requires unlocking a phone and entering a code. Their 2FA app has a terrible delay between prompting you and surfacing the keyboard. Add to that that the login sequence has a short timeout, and sometimes you can miss the window just trying to complete the 2FA flow fast enough.

    GCP is my current cloud, so while there's probably more quips here for them, it's just because I'm using them. Out of the big 3 (AWS, Azure, GCP) I actually think they're alright. Either #1 or #2.

    … most of these clouds have no material means of issue tracking, so quality issues shouldn't really be that shocking. There's no way to bug report it! (And no, Azure, some dinky "feedback portal" where I can upvote at most 3 bugs … is not a bug tracker. Customer support portal != a bug tracker, either.)