Security primer


Security is not what I normally write about. But then something really egregious happened, and since I am in IT… Well, I have stuff to say.

The cause

I don’t talk about my day job here that often. Well, sometimes. But I make most of my income as an IT professional (you thought this website paid my bills? Guess again). In fact, I was responsible for the login system of a high-traffic website for seven years. I gave presentations on security at IT conventions, and prevented a number of security breaches over the years. So, I know a bit about the subject.

This week, I got an e-mail from Spoutible. You know, one of the social media platforms that popped up after Musk took over Twitter. The e-mail told me there had been a security incident and some user information had been exposed. But, and I quote: “Decrypted passwords and direct messages were not disclosed.” Okay, well darn. That sucked, but it happens. I changed my password and thought that was that.

Then, the day after, an article from Troy Hunt appeared, explaining the details. And, well, they suck. Let me explain some basics about login security, then circle back to why this was a troubling breach.

Password security

When you log in to any website, you usually enter an e-mail address and password. The password you enter is compared to the password you entered when you created the account, and if they match, you can continue. Simple, right?

Well… until somebody manages to hack into the database of passwords. The IT community learned that the hard way a few decades back.

So, a new scheme was devised. And it’s pretty clever. It’s based on something called a ‘hash’. A hash function is a mathematical function that takes a certain input, and outputs a different set of characters based on that input. But not just any set of characters. If the input is the same, the output will always be the same too. If the input changes slightly, the output will also change. Most importantly: you cannot reverse the calculation. There is no way to get the input back from the output.

For example, ‘P@ssword’ will result in a Bcrypt hash (Bcrypt is a specific hash function) of ‘$2a$12$GZg340WDqLkMwH7yagteYOfjkZyPi5uljDzTh4C97oWgaP1cDICtG’.

Back to passwords. Using a hash function, you can translate any password to a hash value. But, you cannot get the password back from those characters. Great for storing in a database. But… how do you then check the password at login? Well, the great thing is, when you create your password, you can store that hash. And when somebody enters the password, you simply calculate the hash from that password and compare it to the stored one. You can still match them, but don’t need to store the password.
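To make that concrete, here is a minimal Python sketch of the store-the-hash, compare-at-login idea. Note the assumptions: SHA-256 from the standard library is only a stand-in so the example runs without extra packages (a real system would use a slow, purpose-built password hash such as Bcrypt), and the function names `hash_password` and `verify_password` are made up for illustration.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return a hex digest of the password.

    SHA-256 is a stand-in here; it is far too fast for real password storage.
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate and compare it to the stored hash in constant time."""
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# At account creation only the hash is stored, never the password itself.
stored = hash_password("P@ssword")

print(verify_password("P@ssword", stored))  # True
print(verify_password("p@ssword", stored))  # False
```

The database only ever sees `stored`. The password itself exists just long enough to be hashed.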

Quick side note here, this is also how fingerprints are stored these days, so you can’t reverse engineer fingerprints from a biometric database.

Dictionary attacks

So, awesome, we can store passwords that you can’t decode. However, there is still a problem. If you have hacked the database, you can check hashes against it. A lot of them. So, instead of trying to reverse the hash, you can use a list of commonly used passwords (a dictionary), hash them, and see if they match an account. You’d be surprised how easy a lot of passwords are to guess. Doing a search on the hash above, for example, would find all accounts that used ‘P@ssword’ as a password.
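A dictionary attack on unsalted hashes can be sketched in a few lines. Everything here is hypothetical: the accounts, the tiny word list, and the use of SHA-256 as a stand-in hash. The point is only that one hashed guess can be checked against every account at once.

```python
import hashlib


def unsalted_hash(password: str) -> str:
    """Stand-in hash without any salt (SHA-256 for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: account -> unsalted password hash.
leaked = {
    "alice@example.com": unsalted_hash("P@ssword"),
    "bob@example.com": unsalted_hash("correct horse battery staple"),
    "carol@example.com": unsalted_hash("P@ssword"),
}

# A (very short) dictionary of commonly used passwords.
dictionary = ["123456", "password", "P@ssword", "qwerty"]


def dictionary_attack(leaked_hashes: dict, words: list) -> dict:
    """Hash each guess once, then look every leaked hash up in that table."""
    lookup = {unsalted_hash(word): word for word in words}
    return {
        account: lookup[h]
        for account, h in leaked_hashes.items()
        if h in lookup
    }


cracked = dictionary_attack(leaked, dictionary)
print(cracked)
# {'alice@example.com': 'P@ssword', 'carol@example.com': 'P@ssword'}
```

Alice and Carol fall together because their hashes are identical. Bob’s long passphrase is not in the dictionary, so he is safe from this particular attack.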

There are ways to prevent this kind of attack too. You can use a so-called ‘salt’. A salt is a random set of characters that you store. Usually one for the entire database (the static salt, stored separately from the database) and one for each account (the dynamic salt). Before you calculate the password hash, you stick the static and dynamic salt at the end. As long as you add the same salts at password creation and at login, it’ll work.

For example, say my static salt is ‘abc’ and my dynamic salt is ‘123’, then the system will calculate the hash for ‘P@sswordabc123’, which is ‘$2a$12$m0Gp…ovXEvi’. If a different account uses the same password, but has a dynamic salt of ‘345’, then the hash (for ‘P@sswordabc345’) will be ‘$2a$12$38CR…7YwsW’. Which is different. So, now, even different accounts with the same password have a different hash. That makes a dictionary attack much harder, and without the static salt, harder still.
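The salting scheme above could look roughly like this in Python. Again, this is a sketch under assumptions: SHA-256 stands in for a real password hash, `STATIC_SALT` and the function names are invented for the example, and in practice the static salt would live in configuration outside the database.

```python
import hashlib
import secrets

# Hypothetical static salt; in a real setup it is stored outside the database.
STATIC_SALT = "abc"


def new_dynamic_salt() -> str:
    """Generate a random per-account salt to store next to the hash."""
    return secrets.token_hex(16)


def salted_hash(password: str, dynamic_salt: str) -> str:
    """Append the static and dynamic salt to the password, then hash it."""
    material = (password + STATIC_SALT + dynamic_salt).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


# Two accounts with the same password but different dynamic salts.
hash_a = salted_hash("P@ssword", "123")
hash_b = salted_hash("P@ssword", "345")

print(hash_a != hash_b)  # True: same password, different stored hashes
print(salted_hash("P@ssword", "123") == hash_a)  # True: login still matches
```

An attacker now has to redo the whole dictionary for every single account, instead of hashing each guess once.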

MFA

So, let’s assume you did all that. Are you safe? Nope. You might be in the clear, but other sites might mess up. You regularly read about breaches of databases. And those databases are not always secured as described above. So, hackers might obtain combinations of e-mail addresses and passwords from other places. And, unfortunately, people re-use passwords (hint: don’t re-use passwords).

So, what hackers do, is fire off their list of stolen credentials from site A at uncracked site B, hoping something will stick. That’s called a credential stuffing attack. I’ll not go into more advanced techniques for countering those, but bottom line: you can’t protect yourself from this completely.

So, given that passwords might be hacked, what can you do? Well, to make an account even more secure, you can use MFA (multi-factor authentication). That is, use multiple ways of authenticating using different channels. A password is one channel. A second factor is often a text message to your phone number. But, it can be something else. A physical dongle. A letter sent to your home address (slow, yes, but it works). Or a special time-based code generated from a shared secret (Google Authenticator).
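For the curious: authenticator apps of this kind typically implement the time-based one-time password (TOTP) scheme from RFC 6238. A minimal sketch using only the Python standard library is below. The secret is the published RFC test value, not anything a real site uses, and the function name `totp` is my own.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time code from a shared secret (HMAC-SHA1)."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of 30-second steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks 4 bytes
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Test secret from the RFC; both the server and the phone know this value.
shared_secret = b"12345678901234567890"

print(totp(shared_secret, at=59))  # 287082
print(totp(shared_secret))         # whatever the code is right now
```

Both sides compute the same code from the secret and the clock, so nothing secret has to travel over the wire at login.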

The Spoutible case

So, what happened at Spoutible? Well, they accidentally exposed their hashes. If you retrieved information about a user programmatically (using their well-documented public API), you got back a set of data about the user, which included their e-mail address (oops), their phone number (oops), and to top it off, the password hash (ouch).

Now, if you paid attention, you’ll be asking ‘did Spoutible use salts?’ And the answer is: no, apparently they did not. So, they were leaking hashes, which is the equivalent of putting your user database online. Still, a dictionary attack only works for people with simple passwords, right? Well… they also put the hashes used for the multi-factor authentication online. And that was based only on a simple 6-digit code. A 6-digit code has only a million possible values, so breaking one, even if hashed, is only a few minutes of work.
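To show why a hashed 6-digit code offers so little protection, here is a sketch that simply tries all one million possibilities. SHA-256 is an assumed stand-in for whatever hash was actually used. A slower hash makes each guess more expensive, but the search space stays at one million.

```python
import hashlib


def code_hash(code: str) -> str:
    """Stand-in hash for a stored 6-digit code (SHA-256 for illustration)."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def crack_six_digit(target_hash: str):
    """Try every code from 000000 to 999999 until the hash matches."""
    for n in range(1_000_000):
        candidate = f"{n:06d}"
        if code_hash(candidate) == target_hash:
            return candidate
    return None


# Hypothetical leaked hash of a 6-digit code.
leaked_hash = code_hash("042137")
print(crack_six_digit(leaked_hash))  # 042137
```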

Oh, dear.

Indeed.

But, it gets even worse. They also exposed password reset codes. When you do a password reset, systems mail you a temporary code that you can use for 15 minutes or so, to reset your password. That can’t be salted, but it’s valid for a short time so stealing a database won’t work very long. And the code only ends up in the database and your e-mail. Unless it’s exposed within those 15 minutes. Which Spoutible did.
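A reset code of this kind could be issued and checked roughly as follows. This is a sketch, not how Spoutible or any specific site does it: the 15-minute lifetime comes from the description above, and the record layout and function names are assumptions.

```python
import hmac
import secrets
import time

RESET_TTL_SECONDS = 15 * 60  # the code is only valid for 15 minutes


def issue_reset_code(now=None) -> dict:
    """Create a random reset code plus its expiry time (hypothetical record)."""
    issued = time.time() if now is None else now
    return {
        "code": secrets.token_urlsafe(32),
        "expires_at": issued + RESET_TTL_SECONDS,
    }


def redeem(record: dict, submitted: str, now=None) -> bool:
    """Accept the submitted code only if it matches and has not expired."""
    current = time.time() if now is None else now
    if current > record["expires_at"]:
        return False
    return hmac.compare_digest(
        record["code"].encode("utf-8"), submitted.encode("utf-8")
    )


record = issue_reset_code(now=1_000)
print(redeem(record, record["code"], now=1_000 + 60))       # True
print(redeem(record, record["code"], now=1_000 + 16 * 60))  # False: expired
```

None of this helps if `record["code"]` can be read by anyone during those 15 minutes, which is exactly the problem.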

So, to summarize, Spoutible exposed their password database (bad), didn’t use salts (also bad), exposed their easily attackable MFA codes (worse), and leaked password reset codes (horrible). If you can simply query the reset code, you might as well not use passwords at all.

Fixed! Right?

In fairness, Spoutible closed the leak. They did so in a few hours, which is actually a very quick and good response time. Hunt only put his article online after the leak was fixed, so hopefully the damage was limited.

However… that this happened in the first place is bad. I’ve seen (and prevented) my share of security issues, but I’ve not seen one this egregious. Exposing all this data. It’s really, really stupid.

In fairness, accidentally exposing too much info is a common error. If I had to guess what happened here, I’d say somebody used a mapping library to map their database record to their API output. You feed in the database record, and the JSON output (for the API) comes out. If you don’t put a filter on that, it will simply dump the entire record, including stuff like reset codes and hashes. And I think that’s what happened, but note, that’s a guess.

Whatever happened, it’s understandable, but not excusable. There should be multiple layers of security and QA to prevent this from happening. Normally, people who can create code that accesses the user accounts should have basic security training. Code changes should always be reviewed by multiple people. You should have automated tests in place that deliberately check for mistakes like this. Don’t use such a one-to-one mapper in the first place. You should not expose your account records directly to an API like this. And these kinds of changes should be reviewed by a security officer or QA tester, or both.
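Two of those safeguards are easy to sketch: an explicit allow-list instead of a one-to-one mapper, and an automated check that sensitive fields never show up in API output. The record layout and all field names below are hypothetical. I have no idea what Spoutible’s actual schema looks like.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UserRecord:
    """Hypothetical database record; the field names are made up."""
    id: int
    username: str
    email: str
    phone: str
    password_hash: str
    mfa_secret: str
    reset_code: str


# Only fields listed here ever reach the API; new columns stay private by default.
PUBLIC_FIELDS = ("id", "username")

SENSITIVE_FIELDS = {"email", "phone", "password_hash", "mfa_secret", "reset_code"}


def to_public_json(user: UserRecord) -> str:
    """Serialize only the allow-listed fields of a user record."""
    record = asdict(user)
    return json.dumps({name: record[name] for name in PUBLIC_FIELDS})


def leaks_sensitive_fields(payload: str) -> bool:
    """Return True if an API payload contains any sensitive field name."""
    return bool(SENSITIVE_FIELDS.intersection(json.loads(payload)))


user = UserRecord(1, "martin", "m@example.com", "+31000000000", "hash", "secret", "code")

print(to_public_json(user))                              # {"id": 1, "username": "martin"}
print(leaks_sensitive_fields(to_public_json(user)))      # False
print(leaks_sensitive_fields(json.dumps(asdict(user))))  # True: the naive dump
```

A check like `leaks_sensitive_fields` belongs in the automated test suite, so a careless mapper change fails the build instead of reaching production.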

In other words, three or four people messed up, or simply don’t exist. And the entire system might have some serious design flaws. That shouldn’t happen for a website with several hundred thousand users that wants to take on Twitter — sorry, MusX.

Conclusion

I deleted my Spoutible account. I was already uncomfortable with some of the things the founder said and did over the past year. And I was already not a frequent user. But you know, social media are tiresome, and they all seem to have people at the top I don’t really like (Musk for Twitter, Jack Dorsey for BlueSky, Zuckerberg for Threads/Facebook).

So, I am sorry for the users who were actively following me (yes, I saw you). I hope you can still find this site, or find me on BlueSky, Mastodon, or Facebook. I’m currently most active on BlueSky. Still, not very active.

We really can’t have nice things.

Written by: Martin Stellinga

I'm a science fiction and fantasy author/blogger from the Netherlands
