How Web Apps Work: AJAX, APIs, and Data Transfer

This is a post in the How Web Apps Work series.


An overview of the concepts, terms, and data flow used in web apps: AJAX, JSON, APIs, and data transfer

Web development is a huge field with a vast array of concepts, terms, tools, and technologies. For people just getting started in web dev, this landscape is often bewildering - it's unclear what most of these pieces are, much less how they fit together.

This series provides an overview of fundamental web dev concepts and technologies, what these pieces are, why they're needed, and how they relate to each other. It's not a completely exhaustive reference to everything in web development, nor is it a "how to build apps" guide. Instead, it's a map of the territory, intended to give you a sense of what the landscape looks like, and enough information that you can go research these terms and topics in more depth if needed.

Some of the descriptions will be more oriented towards modern client-side app development with JavaScript, but most of the topics in the series are fundamental enough that they apply to server-centric applications as well.


New terms will be marked in italics. I'll link references for some of them, but encourage you to search for definitions yourself. Also, some of the descriptions will be simplified to avoid taking up too much space or dealing with edge cases. This post in particular does not attempt to be a "how to program" tutorial or complete reference to JS, but will point out common gotchas and differences from other languages like Java, C++, and Python.

The MDN JavaScript docs have a complete set of resources on everything related to JavaScript, AJAX, and browser functionality.

The Modern JavaScript Tutorial has information on built-in AJAX APIs and browser storage.

My JavaScript for Java Developers slides cover much of this post's content as well, showing examples of JS syntax and concepts in a cheatsheet-type format. I'll link relevant sections of these slides throughout this post rather than copy entire large code blocks.


JSON

JSON Overview

JSON is the most widely used data transfer format for web apps. JSON is short for "JavaScript Object Notation", and is an adaptation of JS object/array syntax for use as a data format. All major programming languages have built-in support for reading and writing JSON data, but it naturally fits into JavaScript. JSON has become the de-facto standard data transfer format for HTTP-based APIs, and some databases even support storing and querying JSON document contents. (slides: JSON)

JSON syntax consists of normal JS objects and array definitions, including primitives like numbers, strings, booleans, and null. However, JSON does have some additional rules and restrictions compared to normal JS syntax:

  • No variables
  • No comments
  • All object keys must be quoted with double-quotes
  • The final key in objects and the final item in arrays may not have a trailing comma

The top level of JSON data is usually either an object or an array. (slides: JSON syntax examples)

A typical JSON document for a list of todo objects might look like:

[
  {"id": 0, "text": "Buy milk", "completed": false, "colors": ["red"]},
  {"id": 1, "text": "Clean yard", "completed": true, "colors": []},
  {"id": 2, "text": "Read book", "completed": false, "colors": ["red", "blue"]}
]

Because JSON is a subset of JS syntax, it lacks several things that normal JS has. The inability to write comments is a frequently cited pain point. It also doesn't have native support for date values, so they have to be serialized as strings. You can write numbers in JSON of any size, but because JavaScript itself is limited to 64-bit floating point numbers, in practice that also limits the size of numbers you can pass via JSON.

Working with JSON

JSON content must be serialized by converting it into a single string for transfer, and then deserialized back into actual objects after it has been received.

The JS language supports this with a pair of built-in functions: JSON.stringify(value) will convert any valid value (object, array, primitive) into a complete string. JSON.parse(string) will deserialize a string of JSON text back into its corresponding values.
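As a small sketch of that round trip, the snippet below serializes a shortened version of the todo list from earlier and parses it back. It also demonstrates the date and large-number caveats described above. The variable names are just for illustration.

```javascript
const todos = [
  { id: 0, text: "Buy milk", completed: false, colors: ["red"] },
  { id: 1, text: "Clean yard", completed: true, colors: [] },
];

// Serialize: the result is one plain string, ready to send over the network
const serialized = JSON.stringify(todos);

// Deserialize: parsing creates brand-new objects with the same contents
const parsed = JSON.parse(serialized);
console.log(parsed[1].text); // "Clean yard"
console.log(parsed === todos); // false

// Dates have no JSON representation, so they are written out as ISO strings
const withDate = JSON.stringify({ createdAt: new Date(0) });
console.log(withDate); // {"createdAt":"1970-01-01T00:00:00.000Z"}

// Integers beyond Number.MAX_SAFE_INTEGER lose precision when parsed
const big = JSON.parse("9007199254740993");
console.log(big); // 9007199254740992
```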

When sending JSON content over HTTP, the request or response should use the header content-type: application/json.

Most server frameworks have shortcuts built in for returning a JSON-formatted response. These methods often allow returning an object or array directly, and the framework takes care of serializing the data into JSON format and setting the right HTTP headers. For example, the Express server framework for Node has res.json(obj).
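To make the "shortcut" idea concrete, here is a rough sketch of what such a helper does under the hood. `sendJson` is a hypothetical function written for this post, not part of Express; it assumes `res` behaves like a Node.js `http.ServerResponse` with `writeHead()` and `end()` methods.

```javascript
// Hypothetical helper: roughly what a framework shortcut does for you.
// `res` is assumed to be a Node.js http.ServerResponse (or anything with
// the same writeHead/end methods).
function sendJson(res, statusCode, data) {
  // Serialize the value into a single JSON string
  const body = JSON.stringify(data);

  // Tell the client that the body is JSON, and how long it is
  res.writeHead(statusCode, {
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(body),
  });
  res.end(body);
}

// Possible usage with Node's built-in http module (port 3000 is arbitrary):
//   require("http")
//     .createServer((req, res) => sendJson(res, 200, { ok: true }))
//     .listen(3000);
```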

AJAX

AJAX is a term used to describe making HTTP requests, via JavaScript, in the background of a page, without reloading it. It's an acronym that originally stood for "Asynchronous JavaScript And XML". At the time the phrase was created, XML was the standard data transfer format used for most requests. Since then, JSON has largely replaced XML as the standard data transfer format. Today, the term "AJAX request" normally refers to fetching data in JSON format from a web server over HTTP.

Browsers have a couple of built-in APIs that are used for making AJAX requests: XMLHttpRequest (XHR) and fetch. XHR is generally considered to be a clunky, low-level, and hard-to-use API design. fetch is intended to be a more modern replacement, but still has some noticeable annoyances in its design. Because of those, most apps use a third-party library that wraps one of those two built-in methods to provide an alternate API that is easier to use.

Note: The term "API" always means "other code that you call to get something done", but depending on context, it can mean "an actual function you call in your code", or "a remote request to a server to fetch some data". In the "AJAX" section, "API" refers to actual functions you call in your JS code. In the Data Transfer Protocols section below, it refers to remote network requests to fetch data.

Browsers can cache some HTTP response contents. This is most common with static files like HTML, CSS, and images, but responses to AJAX requests can be cached as well. This is most likely to happen if the request uses the HTTP GET method. Several HTTP headers, such as Cache-Control, also affect caching.

AJAX APIs and Libraries

AJAX API usage typically falls into three categories:

  • The built-in XMLHttpRequest and fetch APIs
  • Third-party wrapper libraries like axios and jQuery that wrap the core APIs
  • Higher-level data fetching abstractions that use the other APIs internally

There are many third-party AJAX wrapper libraries, so we'll briefly look at a couple of the most popular options here.

XMLHttpRequest

XMLHttpRequest, typically referred to as "XHR", is the original browser API for making AJAX requests. It was introduced by Internet Explorer around 1999, and added by other browsers over the next few years. Despite its name, it can be used to transfer any kind of content over HTTP, not just XML.

XHR's API is based on creating an XHR object instance, and assigning callbacks to it for various request lifecycle events. That approach is notoriously hard to work with. (slides: XHR examples)
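A minimal sketch of that callback-based style might look like the following. `getJsonWithXhr` is a hypothetical helper, and the `XhrClass` parameter exists only so the function can be tried outside a browser; in a page you would rely on the default global `XMLHttpRequest`.

```javascript
// Hypothetical helper showing the instance-plus-callbacks style of XHR
function getJsonWithXhr(url, onSuccess, onError, XhrClass = globalThis.XMLHttpRequest) {
  const xhr = new XhrClass();
  xhr.open("GET", url);
  xhr.setRequestHeader("Accept", "application/json");

  // Fires when the server has sent back a complete response
  xhr.onload = () => {
    if (xhr.status < 200 || xhr.status >= 300) {
      onError(new Error(`Request failed with status ${xhr.status}`));
      return;
    }
    let data;
    try {
      data = JSON.parse(xhr.responseText);
    } catch (err) {
      onError(err);
      return;
    }
    onSuccess(data);
  };

  // Fires when the request could not be completed at all
  xhr.onerror = () => onError(new Error("Network error"));

  xhr.send();
  return xhr;
}
```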

XHR is supported in all browsers. It is not built into Node.js, although third-party packages provide compatible implementations there. Because it has been available in browsers for so long, it's often used as the core implementation of other AJAX libraries.

fetch

The fetch API is a newer built-in browser API, meant to be a replacement for XHR. It has a nicer syntax, and uses Promises for handling the responses. (slides: fetch examples)

A typical fetch request might look like:

fetch("http://example.com/api/users")
  .then(response => response.json())
  .then(users => {
    // do something with users data
  });

However, fetch has some behaviors that frequently surprise or annoy developers:

  • fetch only returns a rejected promise if the actual network request failed completely. As long as the server returns a response, fetch will return a successfully resolved promise, even if the HTTP status code was a 4xx or 5xx HTTP error.
  • The initial fetch('/some/url') call returns a promise containing a Response object. That object is not the actual deserialized data from the server. Instead, a second call is required to ask the response to return the response body content in a specific format, such as JSON.
  • Early implementations of fetch did not send cookies to the server by default. Current browsers default to credentials: 'same-origin', but cross-origin requests still need an explicit credentials: 'include' option to send cookies
  • fetch does not automatically stringify objects or set headers when sending JSON - you must do those steps yourself
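The first two points can be handled with a couple of extra lines. This is only a sketch: `loadUsers` is a hypothetical helper, and `fetchImpl` is a parameter purely so the function can be exercised without a real network.

```javascript
// Hypothetical helper; `fetchImpl` defaults to the global fetch
async function loadUsers(url, fetchImpl = globalThis.fetch) {
  // Rejects only if the network request itself fails
  const response = await fetchImpl(url);

  // A 404 or 500 still resolves, so the status must be checked by hand
  if (!response.ok) {
    throw new Error(`HTTP error: ${response.status}`);
  }

  // Second step: ask the Response to read and parse its body as JSON
  return response.json();
}
```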

Also, fetch is not supported in Internet Explorer, and Node.js only ships a built-in fetch in version 18 and later.

Because of this, it's very common to use a library that wraps XHR or fetch. For cases where you want to use fetch but it's not available in your target environment, there are polyfills to add a reimplementation of fetch in environments that don't support it, such as isomorphic-fetch.

Axios

Axios is a widely popular AJAX wrapper library. Its API design is based on the $http service that was part of AngularJS 1.x, particularly the structure of its response objects.

Axios makes common tasks fairly easy. The default axios object has .get() and .post() methods, which return promises. The .post() method accepts a data value as an argument, and if provided, automatically stringifies that into JSON and adds headers to the request. Axios response objects automatically deserialize JSON content, and the body content is always available in response.data. A typical request might look like:

axios.get("http://www.example.com/api/users")
  .then(response => {
    // do something with response.data
  })

In browsers it uses XHR internally, and in Node it switches to Node's built-in HTTP module, so the same API works in both environments.

Axios also provides some powerful capabilities like "interceptors", which allow modification of requests and responses. This is often used for adding auth headers or transforming data.

jQuery

jQuery was historically used to enable powerful DOM manipulations, while smoothing over DOM API differences between browsers. jQuery also includes an AJAX wrapper, $.ajax(). Given the difficulties in working with the plain XHR API, jQuery's AJAX wrapper was another major reason for its popularity. (slides: jQuery AJAX)

The jQuery AJAX API is nicer than XHR, but it shows its age. It primarily takes success and error callbacks, rather than working with promises. It does have the ability to handle responses by returning a precursor to Promises, called a "deferred", and can automatically deserialize JSON data.

If you're working on a simple site and already are using jQuery for other things, it's reasonable to use jQuery to make AJAX calls. Otherwise, prefer using a modern option like fetch or axios. Definitely do not add jQuery to a site just to make AJAX calls.

Other Request Libraries

Many tools and frameworks come with their own HTTP request wrappers. For example, AngularJS 1.x had $http for general requests and a higher-level $resource class for working with REST APIs, while modern Angular has an HttpClient service.

Since the fetch API has several annoyances in its design, many users opt to write their own fetch wrapper to customize the behavior, typically by auto-converting JSON data and rejecting promises on HTTP 4xx/5xx responses.
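A home-grown wrapper along those lines could be sketched as follows. `requestJson` and its option names are made up for illustration, and real projects usually add more (timeouts, auth headers, better error details).

```javascript
// Hypothetical wrapper: stringifies request bodies, parses responses,
// and rejects on HTTP error statuses. `fetchImpl` is injectable for testing.
async function requestJson(url, { method = "GET", body, headers = {} } = {}, fetchImpl = globalThis.fetch) {
  const options = { method, headers: { Accept: "application/json", ...headers } };

  if (body !== undefined) {
    options.body = JSON.stringify(body);
    options.headers["Content-Type"] = "application/json";
  }

  const response = await fetchImpl(url, options);

  if (!response.ok) {
    const error = new Error(`Request failed with status ${response.status}`);
    error.status = response.status;
    throw error;
  }

  // 204 No Content has no body to parse
  return response.status === 204 ? null : response.json();
}
```

With something like this in place, calling code can write `requestJson("/api/users", { method: "POST", body: { name: "Mark" } })` and use a single try/catch for both network and HTTP errors.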

Alternately, redaxios is a reimplementation of much of the axios API as a small wrapper around fetch.

HTTP Request Variations

HTTP is inherently a request/response protocol. That means that a server can never initiate sending a message to a client by itself. However, there are many cases when it's necessary for a server to send data to a client based on an event that occurred in the server.

Over the years, the community has come up with several workarounds. The modern solution to this problem is Websockets, but it's useful to know about these other techniques as well.

Polling

Polling is the simplest option to implement. The client continually makes requests to the server every few seconds, on a timer. However, this is usually inefficient - most of the time the server won't have anything new to report, so the new request is a waste of time and bandwidth.
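A timer-based poller can be sketched in a few lines. `startPolling` is a hypothetical helper, and `checkForUpdates` stands in for whatever request function your app uses.

```javascript
// Hypothetical helper: calls `checkForUpdates` every `intervalMs` milliseconds
function startPolling(checkForUpdates, intervalMs) {
  const timerId = setInterval(() => {
    // Errors are logged rather than thrown so that one failed request
    // does not stop later polls
    Promise.resolve()
      .then(checkForUpdates)
      .catch((err) => console.error("Poll failed:", err));
  }, intervalMs);

  // Return a function the caller can use to stop polling
  return function stopPolling() {
    clearInterval(timerId);
  };
}

// Possible usage (the URL is made up):
//   const stop = startPolling(() => fetch("/api/scores"), 5000);
//   ...later: stop();
```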

Long Polling

Long polling is a variation on the polling approach. Instead of continually making requests, the client opens one request to the server. If the server doesn't have any messages to report, it sleeps the request. The client still has the connection open, and continues to wait for a response from the server. Eventually, the server either has something new to report, or a timer expires. At that point, it wakes up the request, and finally sends back a response to the original open request connection.

The client then immediately makes another request, and the server again either has something to send back immediately, or sleeps on the request.

This approach requires a bit of coordination inside the server, but is more bandwidth-efficient.
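On the client side, the long polling loop might be sketched like this. It assumes a server that holds each request open and answers with 200 plus a JSON message when it has news, or some other status (such as 204) when its timer expires; those conventions, and the helper name `longPoll`, are assumptions for the example.

```javascript
// Hypothetical client loop for long polling
async function longPoll(url, onMessage, { fetchImpl = globalThis.fetch, shouldContinue = () => true } = {}) {
  while (shouldContinue()) {
    try {
      // The server is assumed to hold this request open until it has news
      const response = await fetchImpl(url);
      if (response.status === 200) {
        onMessage(await response.json());
      }
      // Any other status (e.g. 204 on a server-side timeout): just re-request
    } catch (err) {
      // Network failure: wait briefly so we do not hammer the server
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```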

HTTP Streaming

Once an HTTP connection is opened by the client, the server may continue to send data over the open connection without closing out the response, leaving it open indefinitely.

Server Sent Events

Server-Sent Events are a specific form of HTTP streaming that is actually part of the HTML spec. The client creates an EventSource object and adds an event listener callback, and the server writes specifically formatted content to the response over time.
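The "specifically formatted content" is a simple line-based text format. The sketch below builds one event in that format; `formatServerSentEvent`, the `score` event name, and the `/events` URL in the comments are all made up for illustration.

```javascript
// Hypothetical helper that formats one event in the text/event-stream format
function formatServerSentEvent(data, eventName) {
  const lines = [];
  if (eventName) lines.push(`event: ${eventName}`);
  // Each line of the payload needs its own "data:" prefix
  for (const line of String(data).split("\n")) lines.push(`data: ${line}`);
  // A blank line marks the end of the event
  return lines.join("\n") + "\n\n";
}

// On the server, the response uses Content-Type: text/event-stream and
// stays open while chunks like this are written to it:
const chunk = formatServerSentEvent(JSON.stringify({ home: 2, away: 1 }), "score");

// In the browser, a client could listen with:
//   const source = new EventSource("/events");
//   source.addEventListener("score", (e) => console.log(JSON.parse(e.data)));
```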

Websockets

All of those HTTP-based approaches to 2-way message and data transfer have distinct limitations. Websockets were created to address those limitations. Websockets allow opening a persistent connection between a client and a server, where either side may send messages over the connection at any time. (slides: websockets)

A websocket starts out as a standard HTTP request, but the client adds a header telling the server it wants to "upgrade" the connection to a websocket. A websocket is similar conceptually to an actual OS-level networking socket, and like all HTTP requests, the data is sent over an actual socket. But, it's a higher-level abstraction, and is much more limited API-wise compared to a low-level OS socket.

Once created, either side can add event listeners to process received messages, and send messages over the connection. Messages can contain text or binary content.
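In the browser, that looks roughly like the sketch below. `openScoreSocket`, the message shapes, and the URL in the usage comment are hypothetical; the `WebSocketClass` parameter only exists so the function can be tried where no global `WebSocket` is available.

```javascript
// Hypothetical helper: opens a websocket and subscribes to score updates
function openScoreSocket(url, onScore, WebSocketClass = globalThis.WebSocket) {
  const socket = new WebSocketClass(url);

  // Once the connection is established, the client may send messages
  socket.addEventListener("open", () => {
    socket.send(JSON.stringify({ type: "subscribe", channel: "scores" }));
  });

  // The server may push messages at any time; text payloads arrive in event.data
  socket.addEventListener("message", (event) => {
    onScore(JSON.parse(event.data));
  });

  socket.addEventListener("close", () => console.log("Connection closed"));
  return socket;
}

// Possible usage:
//   openScoreSocket("wss://example.com/scores", (score) => console.log(score));
```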

Today, websockets are the standard approach used to provide live updates to a page after it's been loaded, such as updates to sports scores.

CORS

HTTP requests often involve security and authorization concerns. As part of this, browsers specifically implement restrictions such as the Same-Origin Policy, so that scripts can only interact with resources loaded from the exact same origin (the combination of protocol, host, and port). Additionally, a server may only want to allow interactions with its own scripts and client code that it has served, instead of requests from any arbitrary site or client.

Browsers let servers relax this restriction via Cross-Origin Resource Sharing (CORS). For many AJAX requests to an origin other than the host page's own, the browser first makes a pre-flight request to ask the server "are you okay with other sites making data requests to you?". The server may then respond with headers naming the origin that is allowed to make requests, or a value indicating "all requests are okay". If the current page's origin is allowed, the browser will actually make the real request.

CORS is a frequent source of confusion for developers who are just trying to make a request and are surprised when the browser blocks it.

Note that only browsers implement CORS - HTTP requests made outside a browser, such as from another server or other non-browser environments, are not subject to CORS. This can also be confusing - "it worked in my CLI tool / API test client, why doesn't it work in the browser?".
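On the server side, answering those checks comes down to sending the right response headers. Here is a minimal sketch of that decision; `buildCorsHeaders` and the allow-list are hypothetical, and the method and header lists would depend on your API.

```javascript
// Hypothetical helper: decides which CORS headers to send for a request
function buildCorsHeaders(requestOrigin, allowedOrigins) {
  if (!allowedOrigins.includes(requestOrigin)) {
    // No CORS headers: the browser will block the calling script
    return {};
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
    // Tell caches that the response differs depending on the Origin header
    Vary: "Origin",
  };
}

const allowed = ["https://app.example.com"];
console.log(buildCorsHeaders("https://app.example.com", allowed));
console.log(buildCorsHeaders("https://evil.example.net", allowed)); // {}
```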

Data Transfer Protocols

HTTP provides the standard mechanism for making some kind of request and response to a server. However, there are many ways that a server may structure its URLs and the expected format of requests and responses. In order to write code that makes requests to a server, you must know what structure and request formats that server expects your client to use.

There are numerous variations on how server data APIs are structured, but there are a few common categories of API structures you'll see frequently. (slides: AJAX API design approaches)

With all these approaches, the server defines specific endpoints - combinations of URLs, HTTP methods, and expected request/response formats.

REST

REST stands for "Representational State Transfer". A REST API is an HTTP-based API that uses URLs and HTTP methods as the primary approach for determining what kind of request the client is making, and how the server should handle that request.

REST APIs are often used for CRUD (Create/Retrieve/Update/Delete) apps, where the app is conceptually doing operations to update data in a database on the server ("create this thing", "give me a list of these things", "update that thing", "delete that thing").

There are frequent arguments over what exactly constitutes a true REST API, but in general, a REST API:

  • Offers different URLs for each kind of "resource" that it supports
  • Expects different HTTP methods to be used for different kinds of operations on the same resource URL. Typically, HTTP POST/GET/PUT/DELETE map to the Create/Retrieve/Update/Delete operations.
  • Returns different HTTP status codes to indicate success or failure of the request
  • Expects key pieces of data like item IDs to be included as part of the actual URL

Example REST API usage might look like:

  • Get list of users: GET /users
  • Get one user: GET /users/42
  • Create one user: POST /users (body: {name : "Mark"})
  • Update user: PUT /users/42 (body: {name : "Mark"})

Many server web frameworks have built-in support for defining a set of REST API endpoints based on metadata.

RPC

RPC stands for "Remote Procedure Call". This is a general style of network request that is intended to mimic a normal function call in your code, but with the actual logic running on some other server. There have been many RPC tools over the years for many languages. Here, the focus is on RPC-style HTTP requests.

An RPC-style approach describes "methods" to be "called", instead of "resources" to be "operated on". Unlike a REST API, an RPC-style API will likely define URLs whose paths look like verbs or function names instead of nouns.

For example, an HTTP-based RPC server approach might look like:

  • Get a list of users: GET /getUsers
  • Get one user: POST /getUser (body: {userId : 42})
  • Create one user: POST /createUser (body: {userId : 42, name : "Mark"})
  • Update one user: POST /updateUser (body: {userId : 42, name : "Mark"})

Alternately, there might be a single URL endpoint, and the request could have the name of the "method to call" in the body. The JSON-RPC spec uses this approach:

  • POST /jsonrpc (body: {"jsonrpc": "2.0", "method": "createUser", "params": {"userId": 42, "name": "Mark"}, "id": 1})

Either way, the response is likely to have an HTTP 200 status code indicating that the server returned a response successfully, but the contents of the response might have some kind of a field indicating whether the actual operation succeeded, like {status: 'failed'}.
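A single-endpoint dispatcher in this style can be sketched as below. The method table and `handleRpcRequest` are hypothetical; the request and response shapes follow JSON-RPC 2.0, where a failed call still travels inside a normal HTTP 200 response.

```javascript
// Hypothetical table of callable "methods"
const rpcMethods = {
  getUser: ({ userId }) => ({ userId, name: "Mark" }),
  createUser: ({ userId, name }) => ({ userId, name, created: true }),
};

// Looks up the named method and wraps the outcome in a response body
function handleRpcRequest(request) {
  const { method, params, id } = request;
  const known = Object.prototype.hasOwnProperty.call(rpcMethods, method);

  if (!known) {
    // Still HTTP 200: the failure is reported inside the body
    return {
      status: 200,
      body: { jsonrpc: "2.0", error: { code: -32601, message: "Method not found" }, id },
    };
  }

  return { status: 200, body: { jsonrpc: "2.0", result: rpcMethods[method](params), id } };
}
```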

GraphQL

GraphQL is a relatively recent data transfer protocol created by Facebook. Conceptually, GraphQL is a defined format for API request and response content. In addition, there is an ecosystem of client and server tools that are typically used that abstract the details of making a GraphQL-formatted query, processing the request on the server and fetching requested data, formatting the response correctly, and caching the data on the client. Most discussions of "using GraphQL" assume use of the most popular libraries and tools for working with GraphQL.

With GraphQL, the server offers a single URL endpoint, and defines schemas that describe what data types it supports and how they relate to each other. The client sends a GraphQL query request that asks for some subset of those data types, and defines the expected structure of the response format. The server then resolves the requested data types, extracts the requested fields for each type, formats the response as requested by the client, and sends the data back. The server may also support mutations, which allow the client to create/update/delete data, and subscriptions for changes to query results over time.

GraphQL requests are almost always HTTP POSTs, and subscriptions are typically built on top of websockets.

GraphQL query syntax looks somewhat like JSON, but without commas or values:

{
  hero {
    name
    friends {
      name
    }
  }
}

which might result in this JSON response from the server:

{
  "data": {
    "hero": {
      "name": "R2-D2",
      "friends": [
        {"name": "Luke Skywalker"},
        {"name": "Han Solo"},
        {"name": "Leia Organa"}
      ]
    }
  }
}
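Without any GraphQL client library, sending that query is just an HTTP POST with the query text inside a JSON body. The sketch below builds the request options; `buildGraphQLRequest` is a hypothetical helper and `/graphql` is a common but assumed endpoint URL.

```javascript
// Hypothetical helper: wraps a GraphQL query in fetch-style request options
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // The query itself is sent as a string field in a JSON body
    body: JSON.stringify({ query, variables }),
  };
}

const heroQuery = `{
  hero {
    name
    friends { name }
  }
}`;

const requestOptions = buildGraphQLRequest(heroQuery);

// Possible usage:
//   fetch("/graphql", requestOptions)
//     .then((response) => response.json())
//     .then(({ data, errors }) => { /* use data.hero, check errors */ });
```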

GraphQL moves a lot of the query complexity to the server, and gives clients a lot of flexibility in what data they want to ask for. This can simplify cases where different portions of an app need to fetch different subsets of the same data and reduce the need to make numerous calls to fetch related values. On the other hand, it can be more difficult to deal with things like auth and business logic as part of the data resolution on the server, and the most popular GraphQL clients are very heavyweight.

Browser Storage

In addition to all these network requests for data fetching, browsers also offer several tools for storing data within the browser itself. These are typically used to manage data related to user sessions and preferences, but can be used for other scenarios as well.

Cookies

Browsers allow servers to set cookies - small pieces of text associated with a given domain. Any cookies set by a site will then be included on future requests to that site. Cookies are typically limited to about 4 KB each, and are most often used to store a unique session ID that can be read by the server to retrieve additional session data from a database or in memory. Cookies can have expiration timestamps set by the server.

Cookies can be a source of security issues. A server can add an HttpOnly attribute when setting a cookie to ensure that JS code cannot read or modify that cookie.
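Servers set cookies by sending a Set-Cookie response header. As a sketch, the helper below assembles such a header value with the common security attributes; `buildSetCookieHeader` and its option names are made up for this example.

```javascript
// Hypothetical helper that builds the value of a Set-Cookie header
function buildSetCookieHeader(name, value, options = {}) {
  const { maxAgeSeconds, httpOnly = true, secure = true, sameSite = "Lax", path = "/" } = options;
  const parts = [`${encodeURIComponent(name)}=${encodeURIComponent(value)}`, `Path=${path}`];

  if (maxAgeSeconds !== undefined) parts.push(`Max-Age=${maxAgeSeconds}`);
  if (httpOnly) parts.push("HttpOnly"); // hidden from document.cookie in JS
  if (secure) parts.push("Secure"); // only sent over HTTPS
  if (sameSite) parts.push(`SameSite=${sameSite}`);

  return parts.join("; ");
}

console.log(buildSetCookieHeader("sessionId", "abc123", { maxAgeSeconds: 3600 }));
// sessionId=abc123; Path=/; Max-Age=3600; HttpOnly; Secure; SameSite=Lax
```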

localStorage

localStorage is a browser API that lets client code save key/value string data, then retrieve it later. localStorage persists indefinitely, so it's often used to save user settings so they can be reloaded the next time the user visits a page. An example of this might look like localStorage.setItem('userPrefs', JSON.stringify(userData)), and then reversing the process when a page is loaded. Data in localStorage can only be read by code from the same origin, and is typically limited to about 5MB in size.
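Both directions of that process might be sketched like this. `savePrefs` and `loadPrefs` are hypothetical helpers, and the `storage` parameter only exists so they can be tried outside a browser; in a page it defaults to the real `localStorage`.

```javascript
// Hypothetical helpers for saving and restoring user preferences
function savePrefs(prefs, storage = globalThis.localStorage) {
  // Only strings can be stored, so serialize first
  storage.setItem("userPrefs", JSON.stringify(prefs));
}

function loadPrefs(defaults, storage = globalThis.localStorage) {
  const raw = storage.getItem("userPrefs");
  if (raw === null) return defaults; // nothing saved yet

  try {
    // Saved values override the defaults
    return { ...defaults, ...JSON.parse(raw) };
  } catch {
    return defaults; // stored text was not valid JSON
  }
}
```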

sessionStorage

sessionStorage has the same basic API methods as localStorage, but is isolated on a per-tab basis. It will persist between tab reloads, but is cleared when the user closes the tab.

indexedDB

indexedDB is a more powerful database-style storage system that can hold larger amounts of data.

Client URL Routing

In HTTP and Servers, we talked about the idea of routing: determining application behavior based on the URL. For application servers, routing means looking at a URL and HTTP method, and choosing the right application handler logic to process that request.

Client-side applications can do their own form of routing. Browsers offer access to the current URL via window.location, and the browser's URL history via a history object.

Client applications can parse the URL and use that as the basis for dynamically showing and hiding portions of the UI, without needing to make an additional request to the server. Client-side routers typically offer special forms of a link tag that handle clicks by calling the browser's history.pushState() method (often through a wrapper like history.push('/new-route')) to alter the URL without a page load, and then read the updated route to switch to showing a different piece of UI.

However, client routing does add additional complications. If the browser tries to reload a client-routed URL by itself, the server now has to handle that URL somehow with a valid response. Typically, the server is configured so that if a URL like /some-route is not recognized, it returns the entire index page with the full JS bundle as the response, just as if the browser requested the / URL. Once the content is loaded, the client router kicks in, sees that /some-route is the active URL, and immediately shows the relevant content on the client side.
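Stripped of everything a real router library adds, the client-side part can be sketched as a lookup table plus a call to the browser's History API. The route table, page names, and helper functions here are all hypothetical, and `historyApi` is a parameter only so the code can be tried outside a browser.

```javascript
// Hypothetical route table: URL path -> name of the UI to show
const routes = {
  "/": "HomePage",
  "/users": "UserListPage",
  "/settings": "SettingsPage",
};

// Picks which piece of UI matches the given path
function resolveRoute(pathname) {
  const known = Object.prototype.hasOwnProperty.call(routes, pathname);
  return known ? routes[pathname] : "NotFoundPage";
}

// Changes the URL without a page load, then shows the matching UI
function navigate(pathname, render, historyApi = globalThis.history) {
  historyApi.pushState({}, "", pathname);
  render(resolveRoute(pathname));
}

// In a browser you would also re-render on back/forward navigation:
//   window.addEventListener("popstate", () =>
//     render(resolveRoute(window.location.pathname)));
```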
