Encryption powers the modern internet. Without the ability
to exchange data packets privately and securely, e-commerce would not exist,
and users wouldn’t be able to safely authenticate themselves to internet sites.
HyperText Transfer Protocol Secure (HTTPS) is the most widely used form of
encryption on the web. Web servers and web browsers universally support HTTPS,
so a developer can direct all traffic to that protocol and guarantee secure
communication for their users. A web developer who wants to use HTTPS on their
site needs only to obtain a certificate from a certificate authority and
install it with their hosting provider. The ease with which you can get started
using encryption belies the complexity of what is happening when a website and
user agent interact over HTTPS. Modern cryptography—the study of methods of
encrypting and decrypting data—depends on techniques developed and actively
researched by mathematicians and security professionals. Thankfully, the
abstracted layers of the Internet Protocol mean you don’t need to know linear
algebra or number theory to use their discoveries. But the more you understand
about the underlying algorithms, the more you will be able to preempt potential
risks.
Encryption in the Internet Protocol
Recall that messages sent over the internet are split into
data packets and directed toward their eventual destination via the
Transmission Control Protocol (TCP). The recipient computer assembles these TCP
packets back into the original message. TCP doesn’t dictate how the data being
sent is meant to be interpreted. For that to happen, both computers need to
agree on how to interpret the data being sent, using a higher-level protocol
such as HTTP. TCP also does nothing to disguise the content of the packets
being sent. Unsecured TCP conversations are vulnerable to man-in-the-middle
attacks, whereby malicious third parties intercept and read the packets as they
are transmitted. To avoid this, HTTP conversations between a browser and a web
server are secured by Transport Layer Security (TLS), a method of encryption
that provides both privacy (by ensuring data packets can’t be deciphered by a
third party) and data integrity (by ensuring that any attempt to tamper with
the packets in transit will be detectable). HTTP conversations conducted using
TLS are called HTTP Secure (HTTPS) conversations. When your web browser
connects to an HTTPS website, the browser and web server negotiate which
encryption algorithms to use as part of the TLS handshake—the exchange of data
packets that occurs when a TLS conversation is initiated. To make sense of what
happens during the TLS handshake, we need to take a brief detour into the
various types of encryption algorithms. Time for some light mathematics!
Encryption Algorithms, Hashing, and Message Authentication Codes
An encryption algorithm takes input data and scrambles it by
using an encryption key—a secret shared between two parties wishing to initiate
secure communication. The scrambled output is indecipherable to anyone without
a decryption key—the corresponding key required to unscramble the data. The
input data and keys are typically encoded as binary data, though the keys may
be expressed as strings of text for readability. Many encryption algorithms
exist, and more continue to be invented by mathematicians and security
researchers. They can be classified into a few categories: symmetric and
asymmetric encryption algorithms (for ciphering data), hash functions (for
fingerprinting data and building other cryptographic algorithms), and message
authentication codes (for ensuring data integrity).
Symmetric Encryption Algorithms
A symmetric
encryption algorithm uses the same key to encrypt and decrypt data. Symmetric
encryption algorithms usually operate as block ciphers: they break the input
data into fixed-size blocks that can be individually encrypted. (If the last
block of input data is undersized, it will be padded to fill out the block
size.) This makes them suitable for processing streams of data, including TCP
data packets. Symmetric algorithms are designed for speed but have one major
security flaw: the decryption key must be given to the receiving party before
they decrypt the data stream. If the decryption key is shared over the
internet, potential attackers will have an opportunity to steal the key, which
allows them to decrypt any further messages. Not good.
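The block-and-padding idea can be sketched in a few lines of Python. This is a toy illustration only: the XOR "cipher" below stands in for a real block cipher such as AES and offers no security, and the names BLOCK_SIZE, pad, unpad, encrypt, and decrypt are made up for this example. The padding follows the common PKCS#7 convention, which is an assumption rather than something the text above specifies.

```python
import secrets

BLOCK_SIZE = 16  # bytes per block (AES also works on 16-byte blocks)


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """PKCS#7-style padding: append n bytes, each holding the value n."""
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Strip the padding by reading the count stored in the last byte."""
    return data[:-data[-1]]


def xor_block(block: bytes, key: bytes) -> bytes:
    """Toy stand-in for a block cipher: XOR each byte with the key. NOT secure."""
    return bytes(b ^ k for b, k in zip(block, key))


def split_blocks(data: bytes) -> list:
    """Cut the data into consecutive BLOCK_SIZE-byte blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Pad the input, then scramble each block with the shared key."""
    return b"".join(xor_block(b, key) for b in split_blocks(pad(plaintext)))


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """The same key reverses the scrambling, which is what makes this symmetric."""
    return unpad(b"".join(xor_block(b, key) for b in split_blocks(ciphertext)))


key = secrets.token_bytes(BLOCK_SIZE)  # both parties must somehow share this
ciphertext = encrypt(b"Attack at dawn", key)
print(decrypt(ciphertext, key))
```

Notice that decrypt needs exactly the key that encrypt used. Getting that key to the other party safely is the problem described above.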
Asymmetric Encryption Algorithms
In response to the threat of decryption keys
being stolen, asymmetric encryption algorithms were developed. Asymmetric
algorithms use different keys to encrypt and decrypt data. An asymmetric
algorithm allows a piece of software such as a web server to publish its
encryption key freely, while keeping its decryption key a secret. Any user
agent looking to send secure messages to the server can encrypt those messages
by using the server’s encryption key, secure in the knowledge that nobody (not
even themselves!) will be able to decipher the data being sent, because the
decryption key is kept secret. This is sometimes described as public-key cryptography:
the encryption key (the public key) can be published; only the decryption key
(the private key) needs to be kept secret.
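To see how two different keys can undo each other, here is a minimal sketch of textbook RSA with deliberately tiny numbers. The primes 61 and 53 and the exponent 17 are illustrative choices, not values any real system would use, and real implementations add padding and use keys thousands of bits long. The function names are made up for this example.

```python
# Textbook RSA with toy numbers. Illustration only, NOT secure.
p, q = 61, 53                 # two secret primes (far too small for real use)
n = p * q                     # modulus, shared by both keys
phi = (p - 1) * (q - 1)       # used only to derive the private exponent
e = 17                        # public exponent, chosen coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)           # safe to publish
private_key = (d, n)          # must stay secret


def encrypt(message: int, key: tuple) -> int:
    """Anyone holding the public key can encrypt a number smaller than n."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Only the holder of the private key can recover the message."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, public_key)
print(ciphertext, decrypt(ciphertext, private_key))
```

The sender never sees d, so even the party that encrypted the message cannot reverse the operation without the private key.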
Hash Functions
Related to encryption algorithms are cryptographic hash
functions, which can be thought of as encryption algorithms whose output cannot
be decrypted. Hash functions also have a couple of other interesting
properties: the output of the algorithm (the hashed value) is always a fixed
size, regardless of the size of input data; and the chance of getting the same
output value, given different input values, is astronomically small. Why on
earth would you want to encrypt data you couldn’t subsequently decrypt? Well,
it’s a neat way to generate a “fingerprint” for input data. If you need to
check that two separate inputs are the same but don’t want to store the raw
input values for security reasons, you can verify that both inputs produce the
same hashed value.
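A short sketch using Python's standard hashlib module shows both properties. SHA-256 is used here as one example of a cryptographic hash function, and the helper name fingerprint is made up for this illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the input as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


short_input = b"hi"
long_input = b"x" * 1_000_000

# The output length is the same no matter how large the input is.
print(len(fingerprint(short_input)), len(fingerprint(long_input)))

# Equal inputs give equal fingerprints; a one-character change gives a
# completely different one.
print(fingerprint(b"hello world") == fingerprint(b"hello world"))
print(fingerprint(b"hello world") == fingerprint(b"hello worle"))
```

Comparing two fingerprints tells you whether two inputs match without either raw input having to be stored.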
Message Authentication Codes
Message authentication code (MAC) algorithms are similar to
(and generally built on top of) cryptographic hash functions, in that they map
input data of an arbitrary length to a unique, fixed-sized output. This output
is itself called a message authentication code. MAC algorithms are more
specialized than hash functions, however, because recalculating a MAC requires
a secret key. This means that only the parties in possession of the secret key
can generate or check the validity of message authentication codes. MAC
algorithms are used to ensure that the data packets transmitted on the internet
cannot be forged or tampered with by an attacker. To use a MAC algorithm, the
sending and receiving computers exchange a shared, secret key—usually as part
of the TLS handshake. (The secret key will itself be encrypted before it is
sent, to avoid the risk of it being stolen.) From that point onward, the sender
will generate a MAC for each data packet being sent and attach the MAC to the
packet. Because the recipient computer has the same key, it can recalculate the
MAC from the message. If the calculated MAC differs from the value attached to
the packet, this is evidence that the packet has been tampered with or
corrupted in some form, or it was not sent by the original computer. Hence, the
recipient rejects the data packet.
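The send-and-verify flow described above can be sketched with Python's standard hmac module. HMAC with SHA-256 is one common MAC construction, chosen here as an assumption for illustration; the helper names make_tag and is_authentic are hypothetical, and the shared key is simply generated locally rather than exchanged during a handshake.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, packet: bytes) -> bytes:
    """Sender side: compute the MAC for a packet using the shared secret key."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def is_authentic(key: bytes, packet: bytes, received_tag: bytes) -> bool:
    """Recipient side: recalculate the MAC and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, packet), received_tag)


shared_key = secrets.token_bytes(32)   # assumed to be known to both parties
packet = b"transfer 10 dollars to Alice"
tag = make_tag(shared_key, packet)

print(is_authentic(shared_key, packet, tag))                             # untouched packet
print(is_authentic(shared_key, b"transfer 9999 dollars to Eve", tag))    # tampered packet
```

An attacker who alters the packet cannot produce a matching tag without the key, so the recipient's check fails and the packet is rejected.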
The TLS Handshake
TLS uses a combination of cryptographic
algorithms to efficiently and safely pass information. For speed, most data
packets passed over TLS will be encrypted using a symmetric encryption
algorithm commonly referred to as the block cipher, since it encrypts “blocks”
of streaming information. Recall that symmetric encryption algorithms are
vulnerable to having their encryption keys stolen by malicious users
eavesdropping on the conversation. To safely pass the encryption/decryption key
for the block cipher, TLS will encrypt the key by using an asymmetric algorithm
before passing it to the recipient. Finally, data packets passed using TLS will
be tagged using a message authentication code, to detect if any data has been
tampered with. At the start of a TLS conversation, the browser and website
perform a TLS handshake to determine how they should communicate. In the first
stage of the handshake, the browser will list multiple cipher suites that it
supports. Let’s drill down on what this means.
Cipher Suites
A cipher suite is a set of algorithms used to secure
communication. Under the TLS standard, a cipher suite consists of three
separate algorithms. The first algorithm, the key-exchange algorithm, is an
asymmetric encryption algorithm. This is used by communicating computers to
exchange secret keys for the second encryption algorithm: the symmetric block
cipher designed for encrypting the content of TCP packets. Finally, the cipher
suite specifies a MAC algorithm for authenticating the encrypted messages.
Let’s make this more concrete. A modern web browser such as Google Chrome that
supports TLS 1.3 offers numerous cipher suites. At the time of writing, one of
these suites goes by the catchy name of ECDHE-RSA-AES128-GCM-SHA256. This
particular cipher suite includes ECDHE-RSA as the key-exchange algorithm,
AES-128-GCM as the block cipher, and SHA-256 as the message authentication
algorithm. Want some more, entirely unnecessary, detail? Well, ECDHE stands for
Elliptic Curve Diffie–Hellman Ephemeral (a modern method of establishing a
shared secret over an insecure channel). RSA stands for the Rivest–Shamir–Adleman
algorithm (the first practical asymmetric encryption algorithm,
invented by three mathematicians in the 1970s after drinking a lot of Passover
wine). AES stands for the Advanced Encryption Standard (an algorithm invented
by two Belgian cryptographers and selected by the National Institute of
Standards and Technology through a three-year review process). This particular
variant uses a 128-bit key in Galois/Counter Mode, which is specified by GCM in
the name. Finally, SHA-256 stands for the Secure Hash Algorithm (a hash
function with a 256-bit output size).
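If you are curious which cipher suites a TLS client on your own machine would offer, Python's standard ssl module can list them. This is a minimal sketch: the names and their order come from the OpenSSL build that Python is linked against, so the output will vary between systems and will not match a browser's list exactly. The function name offered_cipher_suites is made up for this example.

```python
import ssl


def offered_cipher_suites() -> list:
    """Return (name, protocol) pairs for the suites a default client context enables."""
    context = ssl.create_default_context()
    return [(suite["name"], suite["protocol"]) for suite in context.get_ciphers()]


for name, protocol in offered_cipher_suites():
    print(f"{protocol:10} {name}")
```

Each printed name packs several algorithm choices into a single string, which is why the names look so unwieldy.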
Session Initiation
Let’s continue where
we left off. In the second stage of the TLS handshake, the web server selects
the most secure cipher suite it can support and then instructs the browser to
use those algorithms for communication. At the same time, the server passes
back a digital certificate, containing the server name, the trusted certificate
authority that will vouch for the authenticity of the certificate, and the web
server’s encryption key to be used in the key-exchange algorithm. Once the
browser verifies the authenticity of the certificate, the two computers
generate a session key that will be used to encrypt the TLS conversation with
the chosen block cipher. (Note that this session key is different from the HTTP
session identifier discussed in previous chapters. TLS handshakes occur at a
lower level of the Internet Protocol than the HTTP conversation, which has not
begun yet.) The session key is a large random number generated by the browser,
encrypted with the (public) encryption key attached to the digital certificate
using the key-exchange algorithm, and transmitted to the server.
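You can watch the outcome of a handshake from the client side with Python's standard ssl and socket modules. The sketch below assumes network access and a reachable HTTPS server; the function names make_client_context and describe_tls_session are hypothetical. The default context verifies the server's certificate and hostname before any application data is exchanged.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def describe_tls_session(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Perform a TLS handshake with the host and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake; it raises an SSLError subclass
        # if the certificate cannot be verified.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            suite_name, _protocol, secret_bits = tls.cipher()
            certificate = tls.getpeercert()
            return {
                "protocol": tls.version(),
                "cipher_suite": suite_name,
                "secret_bits": secret_bits,
                "subject": certificate.get("subject"),
                "issuer": certificate.get("issuer"),
            }
```

Calling describe_tls_session("example.com") returns the negotiated protocol version and cipher suite along with the subject and issuer named in the server's certificate. The session key itself is never exposed to calling code.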
Enabling HTTPS
Securing traffic for your website is a lot easier than
understanding the underlying encryption algorithms. Most modern web browsers
are self-updating; the development teams for each major browser will be on the
cutting edge of supporting modern TLS standards. The latest version of your web
server software will support similarly modern TLS algorithms. That means that
the only responsibility left to you as a developer is to obtain a digital
certificate and install it on your web server. Let’s discuss how to do that and
illuminate why certificates are necessary.
Digital Certificates
A digital certificate (also known as a public-key
certificate) is an electronic document used to prove ownership of a public
encryption key. Digital certificates are used in TLS to associate encryption
keys with internet domains (such as example.com). They are issued by
certificate authorities, which act as a trusted third party between a browser
and a website, vouching that a given encryption key should be used to encrypt
data being sent to the website’s domain. Browser software will trust a few
hundred certificate authorities, such as Comodo, DigiCert, and, more
recently, the nonprofit Let’s Encrypt.
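Browsers ship their own lists of trusted authorities, but the same idea applies to any TLS client. As a rough sketch, the following uses Python's standard ssl module to inspect the trust store that Python itself would consult. The function name trust_store_summary is made up for this example, and the reported count depends on the operating system: on systems that load certificates lazily from a directory, it may be zero until a connection is made.

```python
import ssl


def trust_store_summary() -> dict:
    """Summarize the certificate authorities a default client context trusts."""
    context = ssl.create_default_context()
    stats = context.cert_store_stats()        # counts of loaded certificates
    paths = ssl.get_default_verify_paths()    # where OpenSSL looks for them
    return {
        "ca_certificates_loaded": stats["x509_ca"],
        "cafile": paths.cafile,
        "capath": paths.capath,
    }


print(trust_store_summary())
```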