Confidentiality in the information age

If you use services such as those offered by Google and Microsoft, (almost) all your data is available to them in cleartext. It doesn’t have to be this way.

Aug. 12, 2024

Igino Corona

In the previous article we defined what information security is and outlined the main requirements on which it is based. In particular, we defined confidentiality as the requirement that information be accessible only to authorized entities. This requirement is undoubtedly what most distinguishes cybersecurity from other security issues. Think, for example, of a road bridge: while its integrity and availability requirements are obvious (and fundamental), guaranteeing its confidentiality makes much less sense.

One way or another, in order to guarantee any other information security requirement, we will have to keep some (other) information confidential.

In fact, we are facing a recursive mechanism:

to guarantee data confidentiality, as well as all other security requirements, entities must be authenticated;
the authentication mechanism is generally based on confidential information, held (or known) only by the corresponding physical entity.

Information confidentiality, in other words, can only be converted into another confidentiality problem that is more easily manageable and more difficult for an attacker to violate.

Authentication credentials may be based on confidential information such as the classic username and password, numbers (PIN), numerical sequences, but also cryptographic keys (e.g. generated by the apps on your smartphone to uniquely identify it), and even biometric information (think of someone building a clone of your fingerprint). A confidentiality violation affecting such data may break the authentication mechanisms, which in turn may cause the loss of confidentiality of any other information managed by the systems that rely on those mechanisms.

Cryptography on the web

Encryption is the clearest example of conversion from one confidentiality problem into another. A message may be encoded (encrypted) using a cryptographic key, so that it is extremely difficult to recover the original message without knowing this key (or its dual, in the case of asymmetric encryption). The difficulty is typically associated with the computational power needed to solve a mathematical decryption problem without knowing the key. Incidentally, quantum computing is expected to change this picture: it may make some current mechanisms obsolete or insecure, while at the same time offering encryption solutions that are (in principle) impossible to circumvent.

Encryption essentially allows one to provide guarantees on the confidentiality of a message. The problem is therefore to protect the confidentiality of the decryption key.
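To make this conversion concrete, here is a minimal toy sketch in Python, assuming the simplest possible cipher (a one-time pad, where the key is random, as long as the message, and never reused). It is not what HTTPS or real applications use, and the function names are purely illustrative. The point is only that, once the message is encrypted, whoever holds the key holds the message.

```python
import secrets


def encrypt(message: bytes):
    """Return (key, ciphertext). The key is random and as long as the message."""
    key = secrets.token_bytes(len(message))
    ciphertext = bytes(m ^ k for m, k in zip(message, key))
    return key, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key restores the original message."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ciphertext = encrypt(b"meet me at noon")
# The ciphertext can travel over an untrusted channel;
# the key is now the only thing that must stay confidential.
assert decrypt(key, ciphertext) == b"meet me at noon"
```

The ciphertext alone reveals nothing about the message except its length, so the whole confidentiality problem has moved to the key.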

Cryptography can also be applied in the absence of cryptographic keys, through hashing, i.e. a one-way function that deterministically maps a message to a fixed-size value (a digest) from which, however, the original message cannot feasibly be recovered. Hashing therefore introduces a loss of information and is widely used for the conversion and verification of confidential information such as passwords.
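As a sketch of how password verification through hashing can work, the following Python snippet uses PBKDF2 from the standard library. The iteration count, salt size and function names are assumptions chosen for illustration, not a recommendation for any specific system.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, salt: bytes = b""):
    """Return (salt, digest). A fresh random salt is created if none is given."""
    if not salt:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate password and compare digests in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# The service stores only the salt and the digest, never the password itself.
salt, stored = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, stored)
assert not verify_password("wrong guess", salt, stored)
```

Because the function is one-way, a leak of the stored digests does not directly reveal the passwords, although weak passwords can still be guessed by brute force.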

Nowadays, encryption is fundamental to guarantee the confidentiality of data in transit on the Internet, and in particular web traffic, via the HTTP Secure protocol (HTTPS). HTTPS is essentially HTTP communication that occurs through an encrypted channel, in which clients and web servers can authenticate each other via cryptographic controls.

What typically happens is that the web client authenticates the web server by verifying that the HTTPS certificate is signed by one of the Certificate Authorities (CAs) it trusts, of which it has a list. Compromising even just one of these CAs can compromise the entire authentication mechanism of any server/web service. For example, you can consult the list of trusted CAs for Firefox or Safari.
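Browsers are not the only clients that behave this way. As a small illustration, the sketch below inspects the default TLS settings of Python's standard `ssl` module, which also relies on a list of trusted CAs provided by the system. The helper `trusted_ca_names` is a hypothetical name introduced here, and on some systems the list it returns may be empty until certificates are actually loaded during a connection.

```python
import ssl

# The default client context trusts the CAs configured on this system.
context = ssl.create_default_context()

# The server certificate must chain to a trusted CA...
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
# ...and must match the host name the client asked for.
print(context.check_hostname)  # True


def trusted_ca_names(ctx: ssl.SSLContext):
    """List the organization (or common) names of the CAs already loaded in ctx."""
    names = set()
    for cert in ctx.get_ca_certs():
        # 'subject' is a tuple of RDNs, each one a tuple of (field, value) pairs.
        subject = dict(rdn[0] for rdn in cert.get("subject", ()))
        names.add(subject.get("organizationName") or subject.get("commonName", "?"))
    return sorted(names)


print(len(trusted_ca_names(context)), "trusted CAs currently loaded")
```

Every name in that list is an entity whose compromise would undermine server authentication for this client.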

But encryption also plays a key role in web applications that offer the highest standards of confidentiality to their customers. In the vast majority of cases, web service providers (such as Google, Microsoft and to some extent Apple) can access all their customers’ data/documents, and this may seem obvious at first glance.

It doesn’t have to be this way.

Fortunately, there are more and more services based on so-called end-to-end encryption, which make sure that no one other than the user/owner of the data, not even the service provider, is able to access the data in cleartext.

Data encryption in this case typically occurs via JavaScript-based client components of the web application, before the data is sent to the server, which does not know the key and only handles encrypted data. Notable examples of such commercial services are ProtonMail and Tresorit (both Swiss-based enterprises), but there are also open-source web applications (and many free online instances) such as Keybase and CryptPad.
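The flow can be sketched in a few lines. The example below is written in Python rather than JavaScript and is only a toy: the `Server` class, the function names and the SHAKE-256 keystream are assumptions made for illustration, standing in for the authenticated ciphers (such as AES-GCM) that real services would use. It is not the implementation of any of the products mentioned above.

```python
import hashlib
import secrets


class Server:
    """Hypothetical storage provider: it only ever receives opaque blobs."""

    def __init__(self):
        self.blobs = {}

    def put(self, name: str, blob: bytes) -> None:
        self.blobs[name] = blob

    def get(self, name: str) -> bytes:
        return self.blobs[name]


def _keystream(passphrase: str, salt: bytes, nonce: bytes, length: int) -> bytes:
    # Derive the key on the client; the passphrase never leaves the user's machine.
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 200_000)
    # Toy keystream (assumption): a real client would use an authenticated cipher.
    return hashlib.shake_256(key + nonce).digest(length)


def client_encrypt(passphrase: str, plaintext: bytes) -> bytes:
    salt, nonce = secrets.token_bytes(16), secrets.token_bytes(16)
    stream = _keystream(passphrase, salt, nonce, len(plaintext))
    return salt + nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def client_decrypt(passphrase: str, blob: bytes) -> bytes:
    salt, nonce, ciphertext = blob[:16], blob[16:32], blob[32:]
    stream = _keystream(passphrase, salt, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


server = Server()
server.put("note", client_encrypt("my passphrase", b"private note"))
# The provider stores the blob but cannot read it without the passphrase.
assert client_decrypt("my passphrase", server.get("note")) == b"private note"
```

The design choice that matters is where the key lives: encryption and decryption both happen on the client, so the provider's storage never contains cleartext.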

Security through Obscurity and Information Minimization Principle

Achieving security by hiding information, or relying on confidentiality, is often referred to as Security through Obscurity. This is clearly not a recommended practice if it represents the only way in which information security is guaranteed. On the other hand, our previous discussion clearly highlights that information security must necessarily rely on something confidential, so we could say that “security through obscurity” is also somehow inevitable.

In addition, there may be information that does not need to be confidential, but whose concealment is useful to increase the effort required of an attacker motivated to violate the system that manages it. This in turn can reduce the likelihood of an attack being successful, and therefore reduce the associated risk, because attackers also have limited resources and knowledge. This includes all the information that systems expose during their interactions, which is not useful for their functioning, but very useful for an adversary to carry out attacks against them. For example, HTTP messages exchanged by clients and servers typically include headers that precisely indicate their version, the installed components, and the version of the operating system on which they run, easily allowing any attacker or (ro)bot to understand whether there are known vulnerabilities for them and perhaps readily available public exploits to run.
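As a minimal sketch of this kind of hardening, the Python function below drops version-revealing headers from a response before it is sent. The header list and the function name are assumptions for illustration; in practice this is usually done in the web server or reverse proxy configuration.

```python
# Assumed list of headers that commonly reveal software names and versions.
REVEALING_HEADERS = {"server", "x-powered-by", "x-aspnet-version"}


def minimize_headers(headers: dict) -> dict:
    """Return a copy of the response headers without version-revealing entries."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in REVEALING_HEADERS
    }


response_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Server": "Apache/2.4.41 (Ubuntu)",
    "X-Powered-By": "PHP/7.4.3",
}
print(minimize_headers(response_headers))
# {'Content-Type': 'text/html; charset=utf-8'}
```

The application works exactly as before, but an automated scanner no longer learns the exact software versions from a single request.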

We can therefore conclude that “security through obscurity” is, overall, a useful mechanism to increase the cost of an attack (and therefore reduce its probability), but always in combination with best security practices (in fact, best security practices already include this mechanism as part of hardening). The mechanism is analogous to the data minimization principle, which is a key best practice for the protection of personal data (see, for example, the European regulation GDPR).

Confidentiality in the OWASP TOP 10

In the OWASP TOP 10 2021, the confidentiality requirement is (partially) covered by the Cryptographic Failures issue, ranked second.

The following fall into this category: missing or incorrect classification of sensitive data, missing encryption mechanisms for the protection of data in transit and at rest, implementation errors in cryptographic algorithms, and the use of algorithms with known vulnerabilities.

Please note that the list of CWEs associated with this OWASP TOP 10 issue includes one (CWE-296: Improper Following of a Certificate’s Chain of Trust) that can be more closely associated with the authentication requirement, as the main weakness concerns the HTTPS authentication mechanism. This is not surprising: as we have discussed, authentication mechanisms themselves typically need to rely on confidential information.

It should now be clear that confidentiality is only partially covered by the OWASP TOP 10. In particular, there is no mention of the data minimization principle (or, equivalently, Security through Obscurity), which is useful for reducing the risk of security breaches. More importantly, we have seen that encryption is essentially a mechanism for converting one confidentiality problem into another. Data confidentiality fundamentally depends on the level of abstraction at which encryption takes place, but this aspect is not covered by the OWASP TOP 10. Web applications in which data encryption takes place on the user's machine (end-to-end encryption) are generally capable of offering stronger privacy guarantees.
