Codebase Conversion: Migrating a MEAN AngularJS app to React, Next.js, and TypeScript

This is a post in the Codebase Conversion series.


A detailed look at my long process to migrate a classic MEAN app from AngularJS to React, Next, and TypeScript

Intro 🔗︎

In March of 2020, I was reassigned to lead a team building an internal metrics dashboard. This app was built as a classic MEAN project, with a client built using AngularJS 1.x. After I joined the team, I wrote a post recapping how I modernized the codebase by building the AngularJS client with Create-React-App. I also switched package management from NPM+Bower over to Yarn 1.x, switched client tests from Karma to Jest, and was able to use react2angular to begin building new features using React + TypeScript in the middle of the legacy client.

Since then, we've continued to modernize all areas of the codebase, and this week we hit a major milestone: we successfully shipped a fully migrated app client built with React, TypeScript, Redux Toolkit, and Next.js! We've done this as an ongoing incremental migration process, while continuing to build new features, fix bugs, and respond to user feedback.

This post recaps the migration process and the techniques that we used along the way. Some of the pieces I describe are specific to our own app, but many of these techniques and concepts are applicable to other apps doing a migration from legacy codebases to a modern React+TS stack.

The post is mostly arranged chronologically, although I've tried to group some of the related topics together. Hopefully this info proves useful in illustrating approaches for doing an incremental migration in general, as well as some specific techniques you can use.

App Overview 🔗︎

We'll call the application that I work on "Project Echo". It's an internal project metrics dashboard tool. Various projects in the company upload metrics data on a monthly basis. The app processes the data, calculates red/yellow/green statuses for each month's data, and lets management browse the results in various ways to get a sense of how projects are doing.

Architecturally, the app consists of three major sections of code:

  • Backend: a standard Express app with a REST API
  • "Business logic core": definitions of internal data types, generic processing of uploaded data, and specific definitions of each metrics category (data field types, extraction of those fields from uploaded data, processing to derive specific calculated results )
  • Client: a standard SPA, with pages for viewing dashboards summarizing project status and drilling down to details on specific projects

It's pretty simple deployment-wise - no clustering or sharding, just a single instance of the app talking to a single DB. We've got some Ansible scripts that handle ensuring dependencies like Node exist and do the actual app installation steps.

As we pick up the story at the end of the earlier "conversion to Create-React-App" post, the original codebase was approximately 95K lines, which roughly broke down as:

  • Backend: 26K JS
  • Business logic: 21K JS
  • Client: 25K JS, 20K AngularJS HTML templates

However, the app had just undergone a major restructuring of its core data types and corresponding REST endpoints, so about 13K of the backend code and some portion of the client code was already effectively dead and no longer in use at that point. There were also some features that were technically still alive, but hadn't been in active use for a while (and in at least one case the client UI for a feature was completely broken, indicating that no one really cared about using it).

Laying the Foundation 🔗︎

Stabilizing the Codebase 🔗︎

When I took over as team lead, the team had been working on that data type / API refactor for several months, and had gotten bogged down. As I started to understand more about the status of the codebase and the API refactor, I realized that the refactoring effort had left the client in a very buggy state. The use of plain JS + AngularJS HTML templates meant that many field/type renames in the backend hadn't been properly updated on the client, and this was breaking random sections of the UI. It had been months since the last app release to production, and this was also blocking any ability to respond to user feedback. I realized that we needed to get this release out the door, even in a buggy state, so we could start iterating faster.

I'll skip most of the app-specific details here, but to summarize: we managed to get Project Echo 2.0.0 out the door in early June 2020. Once we finally passed that hurdle, we were able to focus on fixing bugs for the next couple months. It felt like a never-ending flood of "this is broken!" reports from users, but we were eventually able to squash most of the major problems and get the codebase mostly stabilized.

Database Migrations 🔗︎

During the prep work for 2.0, I set up a DB migration system under the assumption that it would be useful down the road. This turned out to be some excellent foresight :) I used a tool called umzug, which provides a general-purpose capability to track a list of "migration" tasks of some kind and execute any new migration scripts. It saves the list of executed migration scripts in the DB of your choice, and includes a MongoDB adapter.

I wrote a small driver script that loaded our Mongoose models, initialized the MongoDB connection, and used umzug to load the migration scripts and run any pending migrations. I also added a --revertLast flag that would roll back the most recent migration. In practice this only gets used for local development of a migration script to remove the last "executed script" entry, rather than actually undoing the changes.
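
For reference, here's a minimal sketch of what that driver script looked like, assuming umzug v2's built-in MongoDB storage. The loadModels/connectToMongo helpers and folder paths are placeholders for our own setup, not the exact code:

// scripts/migrate.js - rough sketch, umzug v2-era API
const Umzug = require('umzug');
const mongoose = require('mongoose');
// Hypothetical helpers that load our Mongoose models and open the DB connection
const { loadModels, connectToMongo } = require('../config/lib/mongoose');

async function main() {
  loadModels();
  const connection = await connectToMongo();

  const umzug = new Umzug({
    // Track executed migration scripts in a "migrations" collection in our own DB
    storage: 'mongodb',
    storageOptions: { connection: connection.db, collectionName: 'migrations' },
    migrations: {
      path: './migrations',
      pattern: /\.js$/,
      // Each migration's up()/down() receives the connection as an argument
      params: [connection],
    },
  });

  if (process.argv.includes('--revertLast')) {
    // Remove the most recent "executed script" entry (used during local dev)
    await umzug.down();
  } else {
    // Run any migration scripts that haven't been executed yet
    await umzug.up();
  }

  await mongoose.disconnect();
}

main();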

The migration system got its first workout during the effort to release 2.0, as it turned out we had to do a lot of work to migrate existing data and convert it to the new internal data structures. Since then, we've used it for a variety of other DB updates, like adding DB indices, fixing invalid data, and adding or changing existing data fields as the app has progressed.

Planning the Long-Term Client Migration 🔗︎

In the previous post I covered how using CRA to build the AngularJS client enabled us to start writing React components using TypeScript as part of the AngularJS codebase. That served as a short-term migration plan, and we successfully built out a few small new features using React+TS embedded in the existing UI. This gave us a few islands of stable code we had confidence in, and gave the rest of the team their first chances to learn and use React+TS.

We could have continued on this path to migrate the rest of the client. I've worked on both "incremental bottom-up" and "rewrite it all from scratch" migrations before, and I knew from experience that it's a lot easier to make progress if you can use the "rewrite from scratch" approach - you can make sure that everything is set up "the right way" from the beginning, and you don't have to spend time interoping between old and new code at each step of the way. However, we also couldn't just set aside the legacy client completely and go spend all of our time building a brand new one - we still had a lot of ongoing improvements we needed to make in response to business needs and user feedback.

We needed a middle ground - a way to work on a new codebase, but without completely giving up on the old one.

Prototyping a New Next.js Client 🔗︎

Most of my prior React apps had used either Create-React-App or a custom Webpack config. I'd heard a lot of good things about Next.js, but hadn't had a chance to try it yet. I knew that we were going to need a multi-page capability in order to support a phased migration. Next's page and routing support looked like it fit the bill.

I created a new ./client folder in our repo and created a new Next project with its own list of dependencies. (I also used Yarn's "resolutions" feature to consolidate some of the dependencies between CRA and Next, in order to cut down on duplications in node_modules.) I then configured our Express server to proxy any requests to /newclient/* over to a corresponding page in the Next server, using http-proxy-middleware:

// Match paths like "/newclient", "/_next", and "/api/next"
const reNextServerRequest = /^\/?(newclient|_next|api\/next)/;
app.use(
  httpProxyMiddleware(
    // Determine whether to forward this request or not
    (pathname) => reNextServerRequest.test(pathname),
    {
      target: 'http://localhost:3000',
      pathRewrite: {
        // Strip off "/newclient" entirely
        '^/?newclient': '',
      },
    }
  )
);

As an initial proof-of-concept of combining the two clients, I added an <iframe src="/newclient/testPage"> in the existing client, added pages/testPage.tsx in the Next project, and verified that I could see the contents of the Next page inside the iframe.

We had some interns on my team in the summer of 2020, and I tasked one of them with investigating the feasibility of integrating a new Next app with our existing client. I asked the intern to research and solve a few different questions:

  • Can the Next app use the existing server's session cookies to determine who the current user is, in both the Next client and backend API routes?
  • Can the Next client, running in an iframe, make API queries to the existing Express app's REST API endpoints?
  • Can the Next client communicate across the iframe to the AngularJS host page, in case we need to synchronize behavior in some way?

Happily, we were able to successfully implement answers to all three questions!

User Auth Session Sharing 🔗︎

Because the Next client was running in an iframe, and all page requests were being proxied from the Express server, the Next backend already had access to the Passport session cookie. We used next-connect to create a reusable Express-style middleware that could handle processing the session info and loading the right user profile data.

The session ID cookie was signed with an app secret, so we had to unsign it inside the middleware using cookie-signature.

Passport was configured to keep session data in a MongoDB sessions collection. We added a middleware to establish a MongoDB connection and add that to the request as needed. Inside the session middleware, we queried the sessions collection, parsed the stringified session data, pulled out the actual MongoDB user ID, and queried the DB for the actual user object.
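
Here's a rough sketch of what that middleware looks like, written as an Express-style function that can be passed to next-connect's .use(). The getDb helper, the 'connect.sid' cookie name, and the collection/field names are assumptions based on express-session + connect-mongo defaults, not our exact code:

// sessionMiddleware.ts - illustrative sketch
import type { NextApiRequest, NextApiResponse } from 'next';
import signature from 'cookie-signature';
import { getDb } from './mongodb'; // hypothetical helper returning a connected MongoDB Db

const SESSION_SECRET = process.env.SESSION_SECRET as string; // same secret the Express app uses

export async function loadUserFromSession(
  req: NextApiRequest & { user?: unknown },
  res: NextApiResponse,
  next: (err?: unknown) => void
) {
  // express-session cookies look like "s:<sessionId>.<signature>"
  const rawCookie = req.cookies['connect.sid'];
  if (!rawCookie) return next();

  const sessionId = signature.unsign(rawCookie.slice(2), SESSION_SECRET);
  if (!sessionId) return next();

  const db = await getDb();
  // connect-mongo style: the stringified session data is stored keyed by session ID
  const sessionDoc = await db.collection<any>('sessions').findOne({ _id: sessionId });
  if (!sessionDoc) return next();

  const sessionData = JSON.parse(sessionDoc.session);
  const userId = sessionData?.passport?.user;
  if (userId) {
    // May need to wrap userId in new ObjectId(...) depending on how Passport serializes users
    req.user = await db.collection<any>('users').findOne({ _id: userId });
  }

  next();
}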

This wasn't something we needed right away, but I figured we would need it eventually, so best to solve that problem up front.

Next Client API Queries 🔗︎

This one was easy. The Express app's REST endpoints all live under a URL structure of /api/*. When a Next page is loaded into an iframe in the AngularJS client, it can just request /api/some/data, and the Express server gets first chance to handle it. Any unhandled requests to /api/* were forwarded on to the Next server, and we successfully returned responses to the client from an example API route in pages/api/user.ts.

Cross-Iframe Communication 🔗︎

I'd never done much work with iframes, but I knew that APIs like window.postMessage allow sending text-based messages across frames and tabs. I also knew that there were higher-level libraries that wrapped postMessage to provide a friendlier API.

I settled on using a library called penpal, which lets both the "parent" and "child" sides of an iframe expose specific methods to each other. penpal creates proxy versions of these functions, uses postMessage to serialize the calls, and lets you write standard async function calls in your own code:

import { connectToParent } from 'penpal';

const connection = connectToParent({
  // Methods child is exposing to parent.
  methods: {
    multiply(num1, num2) {
      return num1 * num2;
    },
    divide(num1, num2) {
      // Return a promise if the value being returned requires async processing.
      return new Promise((resolve) => {
        setTimeout(() => {
          resolve(num1 / num2);
        }, 1000);
      });
    },
  },
});

connection.promise.then((parent) => {
  // Call a function exposed by the parent
  parent.add(3, 1).then((total) => console.log(total));
});

We did a simple proof-of-concept where a React component embedded in the existing UI just rendered a list of messages, and a button in the Next client iframe would send a new message across the bridge via penpal.
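
For completeness, the parent (AngularJS-hosted) side of that bridge looks roughly like this with penpal; the iframe lookup and the add method shown here are illustrative:

import { connectToChild } from 'penpal';

// Grab the iframe that hosts the Next.js page (selector is illustrative)
const iframe = document.getElementById('nextClientFrame');

const connection = connectToChild({
  iframe,
  // Methods the parent exposes to the child page
  methods: {
    add(num1, num2) {
      return num1 + num2;
    },
  },
});

connection.promise.then((child) => {
  // Call a method exposed by the child inside the iframe
  child.multiply(6, 7).then((result) => console.log(result));
});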

Configuring the Next Client 🔗︎

Once we'd answered those initial questions, I did some additional configuration work to get the Next client ready for real development. I enabled "absolute imports" by setting "baseUrl": "." in client/tsconfig.json. I also saw that the React DevTools were having issues connecting to the components inside the iframe. After some research, I ended up adding a client/public/js/enableReactDevToolsIframe.js script, and adding it to pages/_app.tsx inside of a Next <Head> tag:

// The React DevTools do not normally look inside of iframes, just the outer window.
// We can enable inspection of React components in an iframe by copying this global variable:
// https://github.com/facebook/react/issues/18945#issuecomment-630421386
// This code must be injected before React runs, so we add it as a separate tag
// and show it via Next.js <Head> in _app.tsx.
if (window.parent) {
  window.__REACT_DEVTOOLS_GLOBAL_HOOK__ =
    window.parent.__REACT_DEVTOOLS_GLOBAL_HOOK__;
}

It looks like the iframe connection issue has since been fixed in the DevTools, but this was very helpful at the time.

We already had a set of environment config files for the backend Express server containing setup for various ports and URLs. I updated next.config.js to import those same config files, and used the serverRuntimeConfig option to expose them into the Next app for future use.
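
Roughly like this, where the config import path and field names are illustrative:

// next.config.js (simplified sketch)
const envConfig = require('../config/env/' + (process.env.NODE_ENV || 'development'));

module.exports = {
  serverRuntimeConfig: {
    // Only visible to server-side code in the Next app
    expressPort: envConfig.expressPort,
    baseUrl: envConfig.baseUrl,
  },
};

// Elsewhere in server-side Next code:
// import getConfig from 'next/config';
// const { serverRuntimeConfig } = getConfig();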

Our deployment environment was configured to use PM2 for managing the Express server process. I added an ecosystem.config.js file that told it to start both the Express server and the Next server together, then updated the build scripts to include the Next client content (including output folders like client/.next/server and client/.next/static). I also updated our Ansible scripts to handle deploying the Next app as well.
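
The PM2 config was along these lines (app names, paths, and ports are illustrative):

// ecosystem.config.js - simplified sketch
module.exports = {
  apps: [
    {
      name: 'echo-express',
      script: './server.js',
      env: { NODE_ENV: 'production' },
    },
    {
      name: 'echo-next',
      cwd: './client',
      // Run the production Next server against the prebuilt .next output
      script: './node_modules/.bin/next',
      args: 'start',
      env: { NODE_ENV: 'production', PORT: 3000 },
    },
  ],
};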

Adding Feature Flags 🔗︎

I knew the migration process was going to take a long time, and we needed to ship frequent bugfix releases. I'd heard a lot about "feature flags" and how valuable they are for letting teams build features over time. I decided to add the world's tiniest feature flag implementation to our codebase. I wrote it in plain JS at the time, but later updated it to TS as the migration continued. Here's what it looks like:

export enum AvailableFeatureFlags {
  Feature1 = 'feature1',
  Feature2 = 'feature2',
}

const featureFlags: { [key in AvailableFeatureFlags]: boolean } = {
  feature1: true,
  feature2: false,
};

export function featureEnabled(featureName: AvailableFeatureFlags) {
  let isEnabled = true;
  if (featureName in featureFlags) {
    isEnabled = featureFlags[featureName];
  } else {
    console.error(`Unknown feature name: ${featureName}`);
  }

  return isEnabled;
}

// Usage in an app file:
import { AvailableFeatureFlags, featureEnabled } from '@config/featureFlags';

if (featureEnabled(AvailableFeatureFlags.Feature1)) {
  // do some extra special logic here
}

This is obviously very limited. Flags have to be updated manually prior to builds, there's no dynamic modification after deployment, etc. But, for the relatively small size of our team and application, this was all we needed.

Like the DB migration script, feature flags were a hugely valuable tool for our client migration process. We were able to use the flags all over the codebase (old client, new client, backend) to control delivery of both new features and UI page conversions in production releases, while continuing to steadily merge PRs into our development branch every sprint.

Building Features in the New Client 🔗︎

With the scaffolding for the Next.js client in place and the legacy client stabilized, it was time to start actually building something meaningful. The next major feature on our list was adding a spreadsheet-style UI to let users enter data directly into our application, as an easier option than the existing approach of filling out Excel workbook templates and uploading them.

We started implementing this new "editing UI" feature directly in the new Next client portion of the codebase. I'll skip the details here since it's not related to the actual migration process. We were able to build out the initial MVP implementation in Q3 2020 and successfully ship it to production. This proved that the mixed-client integration approach actually worked and was feasible for real development, including the use of Next pages embedded in <iframe>, data fetching, and multi-process server app deployment.

One decision I made early on in the development of the editing UI feature was that I was not going to add Redux to the app... yet. The whole application really had no meaningful "global state". There was some user auth/profile data loaded, but otherwise every distinct page handled its own data fetching separately, and that data was basically read-only after load. The use cases did not justify adding a Redux store at the time.

However... I had already added Redux Toolkit to the codebase during development of one of the earlier new React+TS features in the legacy UI. The feature relied on a useReducer hook with some moderately complex state updates, and I was writing the code in TS. Redux Toolkit's createSlice API generates reducer functions, and was specifically designed to simplify writing reducers with a good TS usage experience. So, I used Redux Toolkit and createSlice to write reducers just for use with useReducer, even though we didn't have a Redux store in the app!

When we began adding the "editing UI" feature in the new client, I used the same approach. We had a single editingSlice in a useReducer hook in the root component for the feature, and built the whole feature on top of that.
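
That pattern looks roughly like this; the slice name and fields here are made up for illustration:

import { createSlice, PayloadAction } from '@reduxjs/toolkit';
import { useReducer } from 'react';

interface EditingState {
  selectedRowId: string | null;
  pendingChanges: Record<string, string>;
}

const initialState: EditingState = { selectedRowId: null, pendingChanges: {} };

const editingSlice = createSlice({
  name: 'editing',
  initialState,
  reducers: {
    rowSelected(state, action: PayloadAction<string>) {
      state.selectedRowId = action.payload;
    },
    cellEdited(state, action: PayloadAction<{ field: string; value: string }>) {
      state.pendingChanges[action.payload.field] = action.payload.value;
    },
  },
});

export function useEditingState() {
  // No Redux store involved - the slice reducer plugs straight into useReducer
  const [state, dispatch] = useReducer(editingSlice.reducer, initialState);
  return { state, dispatch, actions: editingSlice.actions };
}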

(I'll admit, part of my decision here was sort of an attempt for me to prove to myself that I could take my own advice about when to use Redux - like, "yes, I'm a Redux maintainer, but I don't have to go add it to every app I work on just because". No one on my team would have complained if I did add Redux :) but I honestly didn't think there was enough reason to use a real Redux store at that point.)

Cleaning Up Build Tooling 🔗︎

As the rest of my team pushed forward building the new "editing UI" feature, I started a major effort to clean up and consolidate our application's build tooling and test setup. My goals were removing unnecessary tools and abstractions, speeding up dev environment steps, and making tool configuration more consistent across the codebase.

Removing Gulp 🔗︎

First on the list was removing Gulp from the codebase. We were currently on Gulp v3. I'd previously updated our app from the original Node 6 to Node 10, but apparently Gulp 3.x didn't work on Node 12, so that was a blocker to further Node upgrades.

Gulp was originally being used to drive almost all dev and build steps, including starting the Express app in dev, running server-side tests, running ESLint, and more. I had previously converted a few pieces, including converting the legacy client test suite to use Jest instead of Karma and building the client via CRA instead of Gulp+Babel, but Gulp was still required for several tasks. I wanted it gone.

The app server was being started up in dev with gulp-nodemon. I was able to easily switch it over to just use nodemon directly, by adding a nodemon.json config file with some environment variables and a watch folders list, and changing the run script to be "start": "cross-env NODE_ENV=development nodemon server.js". I did the same thing for ESLint.
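
The nodemon.json ended up being just a few lines, something like this (the exact env vars and watch folders are illustrative):

{
  "watch": ["server.js", "modules/", "config/"],
  "ext": "js,json",
  "env": {
    "DEBUG": "app:*"
  }
}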

This step also required converting the server API test setup to use Jest instead of Mocha. Since I did a lot of other test configuration changes over the next couple months, I'll cover that in its own section below.

Prettier and ESLint 🔗︎

I'd introduced Prettier to the codebase right after I joined the team, as I was doing the build tooling conversion. I initially configured Prettier to only run on TS files, because I didn't want to mess with the formatting of the rest of the codebase yet.

The original ESLint config for the MEAN boilerplate had a horribly restrictive configuration. It used the Airbnb config as its baseline, and had lots of spacing/formatting-related rules turned on by default and set to be errors. I changed all the formatting-related rules to just be warnings instead of errors (especially since we hadn't been running ESLint as part of any CI process, and there were thousands of spacing-related errors being reported).

I originally assumed that I would eventually do a full-blown "format the entire codebase with Prettier" step to make everything consistent. However, as we continued working on the new client, I decided that it was best to leave the existing code alone, and focus on keeping the new TypeScript portions of the codebase correctly formatted. That way, the Git history for the existing code files wouldn't be obscured by a "MARK REFORMATTED ALL THE THINGS" commit, and as we migrated over time the old code would naturally go away anyway.

I did make a number of other updates to the ESLint config, including switching the parser to @typescript-eslint/parser so it could recognize TS syntax and optional chaining, and adding recognition of globals like jest and globalThis.
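
The relevant parts of the updated ESLint config looked something like this (abbreviated sketch, not our exact rule list):

// .eslintrc.js - abbreviated sketch
module.exports = {
  extends: ['airbnb-base'],
  parser: '@typescript-eslint/parser',
  parserOptions: { ecmaVersion: 2020, sourceType: 'module' },
  env: { jest: true },
  globals: { globalThis: 'readonly' },
  rules: {
    // Formatting issues are warnings, not errors - Prettier owns formatting for TS files
    indent: 'warn',
    'comma-spacing': 'warn',
  },
};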

Module Aliases 🔗︎

The original codebase's imports were a mess. The IIFE-based client files had no way to import any other file, so they all relied on AngularJS dependency injection to access other code, combined with global variables for many third-party libraries. The backend code was full of relative paths like '../../../some/other/folder/here', as well as a weird pattern that attempted to implement "absolute imports" by assuming the current working directory was the root repo folder and doing require(path.resolve('some/path/from/the/root')).

The codebase really had three major source folders: modules, which contained the backend server and the legacy client; config, which contained the core app scaffolding and all the business logic; and client, with the new Next client. I wanted to simplify imports between files, and also enable importing some of the core files in config into the legacy client, the new client, and the backend server, in order to share some utility logic and configuration variables.

My first attempt at this required overriding part of the Next Webpack config in next.config.js, by using next-transpile-modules to do some compilation of files outside of the Next folder, plus a Webpack config rule that ensured files from other folders were processed with babel.

I wanted to have shortcut aliases based on those three folders. For some reason I ended up using '~' for the client folder, and '@config' and '@modules' for the other two. In retrospect I should have been more consistent and gone with '@client', but I didn't and now it's not worth changing that. I had to add a tsconfig.next.json file in the repo root that defined baseUrl: "." plus the shortcuts in paths, then have client/tsconfig.json point back to that.
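
The alias definitions themselves were just a few lines of paths config, roughly:

// tsconfig.next.json (repo root, simplified)
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "~/*": ["client/*"],
      "@config/*": ["config/*"],
      "@modules/*": ["modules/*"]
    }
  }
}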

I redid this a few weeks later, using a neat library called https://github.com/davestewart/alias-hq . It lets you use TS paths as the source of truth for import aliases, and provides functions to convert those into the appropriate format for different tools like Webpack and Jest. It also can be used to override Node's require behavior as well. I used that in the relevant config files for each different portion of the codebase:

// Legacy client: craco.config.js , inside the Webpack overrides
Object.assign(webpackConfig.resolve.alias, hq.get('webpack'));

// Jest tests: jest/jest.client.config.js and jest.server.config.js
module.exports = {
  moduleNameMapper: {
    '^.+\\.(css)$': '<rootDir>/jest/jest.empty-module.js',
    ...hq.get('jest'),
  },
};

// Express app: in a core config file that is always loaded by the Node server
require('alias-hq').get('module-alias');

This worked great, and we've been able to consistently use those aliases ever since.

Final Gulp Removal and Cleanup 🔗︎

After swapping out all the remaining Gulp-controlled tasks to run standalone, I was able to finally remove all traces of Gulp from the codebase. That included the Gulpfile, all Gulp-related dependencies, and mentions of gulp in package.json scripts.

I also took this opportunity to remove a bunch of other dead or irrelevant files as well.

Yarn v2 Upgrade 🔗︎

When I originally swapped the build tooling to use CRA, I had replaced the use of NPM 3.x and Bower for package management with Yarn 1.x. This was working great, but I had noticed that the first several minutes of every Jenkins CI job was taken up by installing packages into a fresh clone. I've been interested in Yarn's "Plug 'n Play" feature ever since it was announced, since storing the original some-package.tgz files directly in the repo both eliminates the need to download deps from an NPM registry, and eliminates the time needed to extract them to disk for installation.

Actually switching the repo to try to use Yarn 2 was easy. Yarn 2 (and 3) have you commit a Yarn "binary"-equivalent yarn.js file directly to your repo (usually in .yarn/releases/yarn.js), and add a .yarnrc.yml with a yarnPath field that points to that.

However, actually using Yarn 2 was much harder than I thought. The biggest issue I ran into was that too many tools still expect files to actually exist in node_modules. Ecosystem compat with Yarn PnP has improved, and Yarn can generate config files for VS Code and other tools that convince them to cooperate better. Where I finally threw up my hands on that process was trying to get the aliases to work with PnP. Yarn has to own all filesystem access for PnP mode to work, and I was getting lost trying to figure out how to get that to cooperate.

Fortunately, Yarn 2 and 3 also ship with a "node_modules linker" that extracts files to disk just like before, and that works great.

I also ran into some corporate-environment-specific issues, as the way Yarn starts up external processes for postInstall scripts was getting flagged by internal process-monitoring security software. I have to give kudos to the Yarn devs here. I asked for help in their Discord, and they provided excellent support for my situation. At one point they were trying out several alternate approaches for starting those processes and generating customized PR builds just for me to try out, even though my situation was clearly an edge case and there was no way to replicate that environment externally.

The upgrade process turned up a number of places in our codebase where we were implicitly relying on dependencies that we hadn't specifically declared in package.json, so I was able to add those.

I did eventually merge in the Yarn 2 upgrade PR, configured to use nodeLinker: node_modules, our internal NexusRepository instance as the NPM registry, and some other settings for SSL certs, timeouts, and warnings. It's been working great ever since, and that experience helped convince me to migrate most of the Redux org repos to Yarn 2 (and later 3) as well.

Server and End-to-End Tests 🔗︎

The original MEAN boilerplate was configured to write "integration"-style server API tests using supertest to interact with the Express app in memory. The previous team lead had done a great job of continuing to write extensive server tests for the REST API behavior. This was very much a good thing - it allowed us to realistically exercise the backend API the same way the real app would. The original setup used Mocha for running tests, driven by Gulp.

I'd already converted the AngularJS client tests over to use Jest instead of Karma. I wanted to consolidate the backend tests to use Jest as well, so that the whole codebase was using the same test runner.

There was also a mostly-broken E2E test setup using the Robot Framework for Python. I'd researched and prototyped use of Cypress on a previous project and really wanted to use that instead, but hadn't had a chance to try setting it up for real yet.

Jest Server Test Configuration 🔗︎

The server API tests were all located in paths like modules/articles/tests/server/admin.article.server.routes.tests.js. I set up a jest/jest.server.config.js file to process those:

const path = require('path');
const testConfig = require('../config/assets/test');

module.exports = {
  rootDir: '..',
  moduleNameMapper: {
    // Support some path aliases starting from the root of the repo
    '@/(.*)$': '<rootDir>/$1',
    'modules(.*)$': '<rootDir>/modules/$1',
    // Treat CSS imports as empty JS objects
    '^.+\\.(css)$': '<rootDir>/jest/jest.empty-module.js',
  },
  moduleDirectories: ['modules', 'node_modules'],
  modulePathIgnorePatterns: ['<rootDir>/dist'],
  testEnvironment: '<rootDir>/jest/serverTestEnvironment.js',
  testMatch: testConfig.tests.server.map((path) => '<rootDir>/' + path),
  setupFilesAfterEnv: ['<rootDir>/jest/jest.server.setup.js'],
  transform: {
    '^.+\\.css$': '<rootDir>/jest/cssTransform.js',
    '^.+\\.html$': '<rootDir>/jest/jest.transform.html.js',
  },
  transformIgnorePatterns: [
    // we want to ignore everything in node_modules,
    // except the HTML templates inside angular-ui-bootstrap
    '/node_modules/(?!angular-ui-bootstrap.+\\.html)',
  ],
  coverageDirectory: 'coverage/server',
  coverageReporters: ['lcov', 'cobertura', 'text'],
};

The test setup file loaded all of our Mongoose models, and added a beforeAll() section that created a Mongoose connection to the DB and an afterAll() that closed the connection.

My initial serverTestEnvironment.js file extended jest-environment-node with some custom behavior that slugified the test filename to create a unique DB name for each test file, under the theory that it would let me run tests in parallel. That turned out to have issues, and I later switched to running jest --runInBand to run test files in serial.

Much later, after several further config changes, I tried running server tests in parallel again and this time was able to get it to work, thanks to some of the changes I'll describe next.

Cleaning Up Server Test Code 🔗︎

The original server API tests were written in a very verbose and painful style, due to using nested callbacks for AJAX requests and should for assertions. This test from the original MEAN boilerplate is a perfect(-ly ugly and horrible) example:

it('should be able to save an article if logged in', function (done) {
  agent
    .post('/api/auth/signin')
    .send(credentials)
    .expect(200)
    .end(function (signinErr, signinRes) {
      // Handle signin error
      if (signinErr) {
        return done(signinErr);
      }

      // Get the userId
      var userId = user.id;

      // Save a new article
      agent
        .post('/api/articles')
        .send(article)
        .expect(200)
        .end(function (articleSaveErr, articleSaveRes) {
          // Handle article save error
          if (articleSaveErr) {
            return done(articleSaveErr);
          }

          // Get a list of articles
          agent
            .get('/api/articles')
            .end(function (articlesGetErr, articlesGetRes) {
              // Handle article save error
              if (articlesGetErr) {
                return done(articlesGetErr);
              }

              // Get articles list
              var articles = articlesGetRes.body;

              // Set assertions
              articles[0].user._id.should.equal(userId);
              articles[0].title.should.match('Article Title');

              // Call the assertion callback
              done();
            });
        });
    });
});

Over the next few months I converted a lot of the should assertions to use Jest's expect(). I also wrote test utilities to abstract basic test operations like "sign in", "create a default test user and other items", and "fetch/post to a URL and return the body". I spent hours converting many of the tests to use async/await syntax instead. Afterwards, that test might look like:

let user, item;

beforeEach(async function () {
  ({ user, item } = await createTestUserAndItem());

  await signin(agent);
});

it('should be able to save an article if logged in', async function () {
  await postBody(agent, '/api/articles', article);
  const articles = await fetchBody(agent, '/api/articles');
  expect(articles[0]).toMatchObject({
    user: {
      _id: user.id,
    },
    title: 'Article Title',
  });
});
Some of the real server tests in our codebase had up to 20 levels of nesting - really the very definition of "callback hell". Flattening those to use async/await was tedious, but also rather satisfying :)

Initial Cypress Setup 🔗︎

Over the summer I'd had an intern work on setting up Cypress for our codebase. I dug up the research notes and prototyping I'd done on a previous project and handed those over as a starting point. The intern was able to use those to get Cypress installed and running in our local development environments, and create a few initial tests that exercised some basic functionality and figured out how to do file uploads.

We weren't running any of these tests automatically, but at least having the basic setup available was a good start.

Jest, MongoDB, and Jenkins 🔗︎

I originally added the "unique slugified DB name per test file" config because we were running these tests locally, and I didn't want to have the tests wipe out the existing DB contents we were using for development. However, I also really wanted to start running the server tests in our Jenkins CI jobs, and that was going to require running a MongoDB server instance somehow. Additionally, I wanted to keep the test DB scaffolding separate from whatever instance of MongoDB we were running in our local dev environments.

After some research, I found https://github.com/nodkz/mongodb-memory-server . It's a neat package that automatically spins up an instance of the mongod binary in the background on demand, and provides the connection URL so you can connect to that in your tests.

However, using mongodb-memory-server was difficult for us. mms wants to download an entire MongoDB distribution file the first time it runs on a machine so it can extract the right mongod binary for the current platform. Our Jenkins servers don't have access to the outside internet, and our local dev machines have to go through a web proxy. Also, those MongoDB distribution archives were typically 200-300MB, and I didn't want to have to wait for a file that big to download every time a CI job ran.

I dug through the mms source code and figured out the URL structure that it was looking for to find the MongoDB archives from a public server. I also figured out where it normally caches the binary file, in ./node_modules/.cache/mongodb-memory-server/mongodb-binaries/$MONGO_VERSION/.

My team uses Windows for development, but the Jenkins agents are RHEL7, so I needed something that could work in both OSes. I ended up downloading the Windows and Linux MongoDB distribution archives for the specific version of MongoDB that we're using, extracted just the mongod binary and any shared libs it needed, and manually reconstructed new zip files for each platform with just mongod inside. I then uploaded those to our internal NexusRepository binaries archive instance.

To use those, I wrote a central config file that defined the NexusRepo base URL + the specific subfolder where I uploaded the archives:

const NEXUS_REPO_MONGO_BASE =
  'https://nexus-repository.ourcompany.com/repository/our-data/mongodb';

export function getMongodbMemoryOptions() {
  process.env.MONGOMS_DOWNLOAD_MIRROR = NEXUS_REPO_MONGO_BASE;

  return {
    binary: {
      version: OUR_MONGO_VERSION,
      skipMD5: true,
    },
    instance: {},
    autoStart: false,
  };
}

I then modified jest/jest.server.globalSetup.js to use mms to establish the initial MongoDB connection, set a process.env.MONGO_URL global to expose that URL to the per-test-file setup logic, and also write the MongoDB connection URL to a temp file. I used that info to also do cleanup in jest.server.globalTeardown.js.

Later, I modified serverTestEnvironment.js to do some very hacky checks to figure out if a given test file even needs a MongoDB connection. I used context.testPath to read the file's source off disk, did a regex search to see if the word mongoose appears anywhere in the source, and set this.global.NEEDS_MONGO = reNeedsMongo.test(fileSource). That way I could skip a bit of the setup overhead for other test files.
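
The environment class ended up looking roughly like this (a sketch assuming the Jest 26/27-era custom environment API; the regex check is the hacky part):

// jest/serverTestEnvironment.js - rough sketch
const fs = require('fs');
const NodeEnvironment = require('jest-environment-node');

const reNeedsMongo = /mongoose/;

class ServerTestEnvironment extends NodeEnvironment {
  constructor(config, context) {
    super(config, context);

    // Read the test file's source and guess whether it touches the DB at all
    const fileSource = fs.readFileSync(context.testPath, 'utf-8');
    this.global.NEEDS_MONGO = reNeedsMongo.test(fileSource);
    this.global.testPath = context.testPath;
  }
}

module.exports = ServerTestEnvironment;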

Once the mms setup was working, I added a small downloadMongoMemoryBinary.js script that called MongoBinary.getPath(getMongodbMemoryOptions()) so that I could force mms to pre-download and extract that binary as needed, and added it as a setup step in our Jenkinsfile for the tests. This let me run the server API tests in Jenkins, alongside the unit tests for both the legacy and new clients and a lint check. Having those actually run in CI gave us a lot more confidence that we weren't breaking things as we continued refactoring.

Parallelizing Server Tests with MongoDB 🔗︎

I eventually revisited the idea of unique sub-DBs per test file to allow parallelization, and ended up with this in jest.server.setup.js:

if (global.NEEDS_MONGO === true) {
  loadModels();

  beforeAll(async function () {
    if (mongoose.connection.readyState === 0) {
      // Create a unique DB name for this file's run by taking the original test filename
      // and replacing all periods with dashes. Max DB name is 64 characters.
      const mongoURL = new URL(process.env.MONGO_URL);
      const testFilename = path.basename(global.testPath);
      const dasherizedPath = testFilename.replace(/\./g, '-').slice(0, 63);
      mongoURL.pathname = `/${dasherizedPath}`;
      const fileUniqueDatabaseURL = mongoURL.toString();

      await mongoose.connect(fileUniqueDatabaseURL, {
        keepAlive: true,
        reconnectTries: Number.MAX_VALUE,
        useMongoClient: true,
      });
    }
  });

  afterAll(async function () {
    try {
      await mongoose.connection.dropDatabase();
      await mongoose.connection.close();
    } catch (err) {
      // Ignore errors
    }
  });
}

That actually let me run all the server API tests in parallel without any of the different test files stomping on each other in the DB, giving us a noticeable speedup.

Cypress and MongoDB 🔗︎

I wanted to have a repeatable test dataset for use in our Cypress tests, and I also didn't want to have Cypress interfering with the local dev Mongo instance either. I added a script that could dump a subset of our DB contents for use in Cypress tests. It did some analysis of our DB contents to determine what items to dump, then shelled out to mongodump with a query that matched those exact items per collection. The final dump call is implemented as:

function dumpCollectionData(
  mongodumpPath,
  outputPath,
  databaseName,
  collectionName,
  query
) {
  fs.ensureDirSync(outputPath);
  const stringifiedQuery = JSON.stringify(query);

  const commandArgs = [
    '--db',
    databaseName,
    '--collection',
    collectionName,
    '--query',
    `${stringifiedQuery}`,
    '--out',
    outputPath,
  ];

  crossSpawn.sync(mongodumpPath, commandArgs, { stdio: 'inherit' });
}

I updated the main app's MongoDB config setup to check for a process.env.E2E variable to determine if it should spin up mms instead of making a connection to the local dev Mongo instance, so that I could run the app servers in a production-like scenario but on my own local machine.

On the Cypress side, I updated cypress/plugins/index.js to import our standard MongoDB connection setup config, and added a seedDatabase command that reads the MongoDB URL from that JSON file that the mms setup writes to disk. It then clears the DB, finds the most recent dumped DB dataset, and uses https://github.com/cdxOo/mongodb-restore to load those dumped files programmatically. (I did this because I couldn't guarantee that mongorestore would be available on the machine where Cypress was running.)
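
A rough sketch of that plugin wiring is below; the temp file path and the restoreDumpedData helper are placeholders standing in for the real mongodb-restore call, not actual APIs:

// cypress/plugins/index.js - rough sketch
const fs = require('fs');
const { MongoClient } = require('mongodb');

module.exports = (on, config) => {
  on('task', {
    async seedDatabase() {
      // The mms global setup writes the in-memory MongoDB URL to a temp file (path is illustrative)
      const { mongoUri } = JSON.parse(fs.readFileSync('.tmp/mongoConfig.json', 'utf-8'));

      // Clear out whatever the previous test run left behind
      const client = await MongoClient.connect(mongoUri);
      await client.db().dropDatabase();
      await client.close();

      // Placeholder: load the most recent mongodump output programmatically via mongodb-restore
      // await restoreDumpedData(mongoUri, findLatestDumpFolder());

      return null; // Cypress tasks must return a value (or null)
    },
  });
};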

Running Cypress in Jenkins 🔗︎

Our Cypress tests stayed local/manual-only for over a year, which also meant that they really weren't helping us much. In mid-2021 I finally tackled trying to get them to run in Jenkins as well.

The big complication was that they required a full copy of the built app running: Express server, Next server, and MongoDB instance. My first thought was that we'd probably need to build some Docker images containing an install of MongoDB + the job build artifacts, and have Jenkins spin those up somehow. I had another summer 2020 intern look into that, and they did make some progress in coming up with a Dockerfile.

However, later on I rethought that based on what I'd done with Jest + mms. The Jenkins jobs were building all the artifacts we needed already, the Jest server tests were running based on mms, and I'd also configured Cypress to use mms as well. What if I just ran the entire Project Echo server process setup right there in Jenkins, and configured Cypress to run in Jenkins as well?

I wrote a small node script that used the concurrently JS API to start up both processes with the right environment variables ( env: {NODE_ENV: 'test', E2E: 'true'} ). I updated package.json with a run script to kick that off and start Cypress when ready, using https://github.com/bahmutov/start-server-and-test :

{
  "scripts": {
    "test:e2e:startServer": "node scripts/runServersE2E.js",
    "test:e2e:cypress": "yarn cypress run",
    "test:e2e:run": "start-server-and-test test:e2e:startServer 3001 test:e2e:cypress"
  }
}
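
The runServersE2E.js script itself was only a few lines using concurrently's JS API. This sketch assumes the Next server gets started via yarn in the client folder; the exact commands are illustrative:

// scripts/runServersE2E.js - rough sketch
const concurrently = require('concurrently');

const env = { NODE_ENV: 'test', E2E: 'true' };

concurrently(
  [
    { command: 'node server.js', name: 'express', env },
    { command: 'yarn --cwd client next start', name: 'next', env },
  ],
  {
    // If either server dies, shut the other one down too
    killOthers: ['failure'],
  }
);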

On the Jenkins side, I uploaded a specific Cypress binary zip to our NexusRepo instance to make it accessible to our jobs. I updated our nightly "code quality" Jenkinsfile with some environment variables that checked to see if a ./cypress-cache folder existed in the job workspace, and if not, download and install the Cypress binary into that folder.

I then added a Jenkinsfile stage that executes those commands and checks the results, without failing the job, and also attaches the HTML report file I generated from Cypress using https://github.com/LironEr/cypress-mochawesome-reporter :

stage('Integration Test') {
  steps {
    // Remove existing code coverage output folders to ensure a clean run
    dir('.nyc_output') {
      deleteDir()
    }
    dir('coverage/cypress') {
      deleteDir()
    }

    // Print some useful info for reference
    sh 'yarn cypress info'

    script {
      // Run Cypress, but capture the exit code without failing the entire job
      // Cypress returns a non-zero exit code if 1+ tests failed
      rc = sh(script: 'yarn test:e2e:run', returnStatus: true)

      if (rc != 0) {
        // Email team leads on test failures, with attached HTML report file
        emailext subject: "Cypress Tests ${BUILD_NUMBER} Failed", to: "team.lead@ourcompany.com", attachmentsPattern: "cypress/reports/html/index.html", body: "Project Echo Cypress tests failed: ${RUN_DISPLAY_URL}\n\nSee the attached test results in `index.html` for more details"
      }
      }
    }
  }
}

This worked great, and our Cypress tests have been running on a nightly basis ever since.

Migrating Existing Code to TypeScript 🔗︎

We'd been writing almost all new code in the codebase as TypeScript ever since I did the initial conversion of the legacy client build tooling to use CRA, and all new code in the Next app was written in TS. However, the majority of the codebase was still plain JS: the Express server endpoints, the core business logic, and all of the legacy AngularJS client.

It was time to tackle converting much of that code to TS. The server endpoints and business logic were going to be part of the codebase going forward, so we really needed to get as much of that converted to TS as possible. While the legacy AngularJS code was hopefully going to go away eventually, I knew the migration process would take a long time. So, any TS conversion I could do there would still have benefits as long as that code was in use, especially since some of the client logic had been incredibly prone to bugs and hard to decipher.

Configuring TypeScript for the Backend Server 🔗︎

The first step was to enable use of TS for the Express app and business logic core. I already had two tsconfig files, one for the legacy client/CRA, and one for the Next client. I added a tsconfig.server.json with appropriate compiler options and include settings to match all of our backend code folders.

nodemon was set up to run and reload the Express app in local dev. I updated it to run ts-node --pretty --log-error --project tsconfig.server.json server.js, and changed the watch settings to capture any .ts files as well.

Since some of our files were now TS, just running any of our migration or utility scripts under Node would fail when Node tried to actually import a TS file. I added a "run-ts": "ts-node --project tsconfig.server.json" package script so that we could reuse that in other script commands, and run migrations and other scripts safely.

For production builds of the server, I just ran tsc and configured it to output all the .js files side-by-side with their original .ts source files. Since that generated a bunch of build artifacts that couldn't easily be ignored, I also wrote a cleanServerBuildOutput.ts script that globs for all server-side TS files, checks each file path to see if there's a side-by-side .js file, and if so removes it.
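
The cleanup script is basically a glob plus an existence check. Here's a simplified version; the folder list is illustrative:

// scripts/cleanServerBuildOutput.ts - simplified sketch
import fs from 'fs';
import glob from 'glob';

// Folders containing server-side TS that tsc compiles in place (illustrative list)
const serverFolders = ['modules', 'config'];

for (const folder of serverFolders) {
  for (const tsFile of glob.sync(`${folder}/**/*.ts`)) {
    // If tsc emitted a side-by-side .js file for this .ts source, remove it
    const jsFile = tsFile.replace(/\.ts$/, '.js');
    if (fs.existsSync(jsFile)) {
      fs.unlinkSync(jsFile);
    }
  }
}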

I also had to update our Jest server test config to use ts-jest and point it to the right tsconfig.server.json.

I tested this all out by migrating a couple of small util files to TS, and confirming that the code ran in local dev, tests, and a staging server environment.

Converting the Backend Business Logic to TypeScript 🔗︎

With all that in place, I finally dove into converting our core business logic to TS. In addition to actually converting the code itself from JS to TS, I also wanted to accomplish several other goals: finally understand what most of that code was actually doing and how it worked, consolidate and clean up the code, and come up with good TS types for our core data structures.

The first step was to start defining types for our data structures. I pored through dozens of JS code files to trace declarations and usage, and slowly accumulated a core appTypes.ts file that covered actual business objects, backend configuration definitions, enums, utilities, and everything else that I ran across. I continued adding to this file and updating it over the course of the conversion process. That gave me a consistent place to look and reference types as I went along.

From there I began recursing through the business logic folder structure. I started at the root logic module file, renamed it to .ts, and started adding a bunch of // @ts-ignore on its imports. I then worked my way down through each of the logic feature subfolders, manually renaming files one at a time, and tracing through each file to figure out types for variables.

I did briefly play around with a couple of automated TS migration tools to see how well they worked, but concluded the results weren't sufficiently useful in our case. If I was going to have to go through and clean things up anyway, it was best if I just did it all by hand to make sure I knew things were correct.

I'll again skip the app-specific aspects of the cleanup, because it's proprietary and not relevant to the "migration" aspect. But, I can say that when I started, that section of the code was about 20K lines, and the code for each of the 30-ish sub-features was split across a dozen different files. By the time I was done, that was consolidated down into a single file per sub-feature and about 17K lines total, including the hundreds of lines of TS typedefs I'd added. So, a lot less code to read, and a lot easier to understand and follow the actual code.

The biggest aid in this entire process was actually our existing server API test suite. Those tests already gave us about 95% coverage on the backend business logic code. This was absolutely invaluable in making sure that my code cleanup efforts didn't break anything... and boy, did I break a lot of things :) I frequently would make what seemed like a perfectly reasonable refactor step that ought to preserve the existing behavior, re-run the tests, and see a half-dozen errors spew onto the screen telling me that "nope, you done screwed up!" Having that safety net helped me avoid what would have been dozens of breakages.

I will note that I did spend an extensive amount of time defining certain core types during this process, including some reusable base types that get "inherited" by various features, and a couple of particularly complex type definitions needed by some of the configuration setup. The latter took me several days to get right, but it paid off later by inferring many function arguments and field types from a single supplied generic type. This was one of the rare occasions where it was worth spending more time to write a complex typedef, to help ensure the type safety of the core logic. It also actually simplified the usage in dozens of files, so it was a net win.

Converting Express Endpoints to TypeScript 🔗︎

I didn't want to tackle converting all of our app endpoints to TS at the time, but I did want to at least get a few converted so we had examples of what that looked like. This process was actually fairly straightforward.

The main types needed are import type { Request, Response } from 'express', and use those in your Express handlers like async (req: Request, res: Response) => {}.

In our case, this was made much more complicated because the application was treating req as a pseudo-global variable. Any part of the endpoint processing logic and much of the business logic could (and did) read and write various fields on req at any point in time. I spent wayyyyy too long tracing through most of the backend logic to assemble a list of all the req.whatever fields that were getting used, and where.

I ended up defining a single combined type that represented all the fields that could exist on req at any time:

import type { Request } from 'express';

export interface ExtraRequestData {
  user: User;
  data?: BasicObject;
  now: Date;
  businessObject1?: BusinessObject1;
  businessObject2?: BusinessObject2;
}

export type AppRequest = Request & ExtraRequestData;

and then use that as async (req: AppRequest, res: Response) => {}. It's not perfect, but it at least captures most of the ways we use req. (I'm looking forward to doing a major refactor of the backend logic in 2022, and hopefully we can completely get rid of all code that mutates req.)

Converting AngularJS Client Controllers to TypeScript 🔗︎

Similarly, I didn't want to spend a lot of time migrating all the legacy AngularJS code to TS. However, there were a few key controller files that A) were the core of the logic on the client, B) were long and complicated, C) relied on fetching and reading the core data types from the server, D) had been frequently edited over time, and E) were also very prone to bugs and hard to read. I wanted to convert those key controllers to TS to give us some more confidence in that code as long as it was still in use.

The first issue was the module file formats. All of our AngularJS client files were originally written as IIFE-encapsulated files, and relied on a mixture of AngularJS dependency injection and global variables. Fortunately, since those were now being loaded with CRA, I could just delete the IIFE wrappers and start adding import statements to treat them as ES modules.

AngularJS has decent TS types, as did several of the related libraries we were using like angular-ui-notification. Within each controller file, I typed all of the injected variables as best as I could. I also defined an interface representing the "controller instance" object that was returned, to help clarify what it contained and make sure I wasn't missing anything. Along the way I had to convert a lot of client-side data manipulation logic to TS as well.

This looked roughly like:

import angular, * as ng from 'angular';

import { $FixTypeLater, BusinessType1, BusinessType2 } from '@config/appTypes';

angular.module('our-app').factory('Feature1Factory', Feature1Factory);

Feature1Factory.$inject = [
  '$window',
  '$rootScope',
  '$state',
  'Notification',
  'Authentication',
  'SomeDataService',
];

function Feature1Factory(
  $window: Window,
  $rootScope: ng.IScope,
  $state: UIRouterService,
  Notification: ng.uiNotification.INotificationService,
  Authentication: AuthenticationService,
  DataService: SomeDataService
) {
  interface Feature1Controller {
    a: string;
    b: number;
    fetchStuff: (arg: string) => Promise<Items[]>;
    setSomething: (value: number) => void;
  }

  const controller: Feature1Controller = {
    a: 'stuff',
    b: 42,
    fetchStuff,
    setSomething,
  };

  async function fetchStuff(arg: string) {}

  function setSomething(value: number) {}

  return controller;
}

This effort really paid off as we continued the actual client migration process, because it was a lot easier to see what the original code was doing. (This also stopped a lot of the bugs we'd been seeing in the legacy codebase as well.)

TypeScript File Conversion Approach and Techniques 🔗︎

Awright, here's the part a lot of people will probably be interested in: how do you approach converting a legacy JS file to TS?

In this case, there is no silver bullet. You have to take your cue from Nike or Shia LaBeouf - "just do it!". I can provide a general approach that I used, and some specific techniques that I found valuable in the process.

My Philosophy of TS Conversion 🔗︎

When I wrote about Learning and Using TypeScript as an App Dev and Library Maintainer, one of my most important takeaways was that "pragmatism is vital". I know there's many TS users who will spend hours trying to craft "perfect 100% correct and type-safe" types, and would tell you that "if you have a single use of any in your codebase your code isn't type-safe and could burst into flames at any minute" (only slightly exaggerated or paraphrased here).

As you can tell, I disagree with that stance :)

As I said at the time:

My goal is the "80% sweet spot" of type coverage. I'd like types for our API calls, Redux state and logic, and React components. As long as those are basically in place, I'd say we're in fairly good shape, especially given that our team is new to TS. If there's a bit of code that is particularly difficult to type, I'm entirely happy to slap in a type FixTypeLater = any, and move on with my life instead of banging my head against a wall for several hours.

Note that I'm not saying I'm against having 100% type coverage. It's just that, like unit test coverage, getting that last 10-20% becomes much much harder, and it doesn't seem like the benefits are necessarily worth the amount of time you have to put in to make it happen.

During this initial TS conversion phase, it really is best to get reasonably good types in place. Give it your best shot, but feel free to leave // TODO comments, use // @ts-ignore as needed, and keep on going. I'd rather have some of the types be placeholders or "loose" types while keeping the TS configuration itself strict, than loosen the TS configuration and inadvertently cause other problems.

Having stated all that, it's ironic that I have spent a large portion of my Redux maintainer effort this year working on some insanely complex TS types for Redux Toolkit, Reselect, and React-Redux :) But, an app being converted from JS to TS is a very different scenario from building a library. Libraries must have good TS types to cover all the ways they're going to be used, and other people depend on those types. For the app, we're the only ones who need the types, and the complexity of any given bit of code is much smaller, so the types are simpler.

TypeScript Settings 🔗︎

To start, make sure that you have TS configured with "strict": true and "noImplicitAny": true settings! The goal is to catch any places where TS can't correctly infer what a variable type is, as well as common errors like null/undefined coercion. You will probably also want to turn on "esModuleInterop": true to better handle cross-module-format imports.

File Conversion Process 🔗︎

The standard approach is:

  • Rename the file from .js to .ts. Commit just the rename by itself, with no further changes. This helps ensure that Git recognizes that what you did was a rename, so it can continue to track the history of the file before and after the conversion.
  • The file may not have been auto-formatted previously, because I had Prettier configured to only format TS files. Press spacebar somewhere in the file, and hit CTRL-S to save. We have VS Code configured to auto-format with Prettier on save, so that kicks in and properly formats the file. Commit just the reformat changes by themselves.
  • Update all legacy require() imports to use ES module import/export syntax. (If by chance you're dealing with AMD files, same applies for the define(['a', 'b'], function(a, b) {}) header.) You'll need to carefully look at the other files being imported and review what they actually export to make sure you're doing the right named/default imports in the new TS file. (More on this later.) It's also likely that many of those imported files are still plain JS, and TS will complain because it doesn't know what type they are. Slap a // @ts-ignore on each of them and come back to these later.
  • At this point the file probably has several dozen red error squigglies. Work your way down the file top to bottom, adding types to all function arguments and as many variable declarations as possible. Many of these are "implicit use of any" in function parameters, or declarations of empty arrays. Eyeball the code, inspect the variable usages, and figure out the best types that you can. If you run across a type that's used in more than one place, extract it to a reusable type with a good name and consistently use that type.

Note that the goal is not to get perfect types in place right away! As covered in the next section, I frequently had to fill in placeholders or types that just said "yep, it's an object/array/whatever", move on to other sections, and come back later. Getting some types in place, even if they're imperfect, is a huge step in the right direction.

Specific TS Conversion Tips and Techniques 🔗︎

Right at the beginning, I defined a few placeholder types:

type $FixTypeLater = any;

type $FixTypeLaterFunction = Function;

type BasicObject = { [key: string]: any };

I used these liberally during the initial TS conversion process. To be clear, these are not good types :) But they were critical in helping me get enough of the code typed to build cleanly. $FixTypeLater lets me distinguish between "this is an intentional use of any" and "I just don't know what this type is yet". BasicObject at least marks that "this type is an object" rather than a primitive or an array, and also works for places where you're dealing with an unknown blob of data being passed through the system.
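As a hypothetical example of how these get used during a first pass (the function and field names here are made up, not from our actual codebase):

// First pass: placeholder types keep the build green and mark what's left to figure out
function processUploadedMetrics(rawUpload: BasicObject, config: $FixTypeLater) {
  // TODO: figure out what shape `config` actually has
  const enabledCategories: string[] = config.enabledCategories ?? Object.keys(rawUpload);
  return enabledCategories.map(category => ({
    category,
    values: rawUpload[category],
  }));
}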

Sometimes TS can figure out what a type is automatically. For example, "a,b,c".split(",") is going to return a string[]. Other times you will have to inspect the surrounding code to look for clues from the usage.

One common problem is code that initializes an empty array, like const items = [], and then calls items.push(newItem) later. Without a type declaration, TS can't infer a useful element type for that array - it knows it's an array of some kind, but it doesn't know what the items are, so it won't let you do much with the values you read back out. Update all array declarations to include a type, like const items: Item[] = [].
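For example, a small before/after sketch (the WarningMessage type here is made up for illustration):

interface WarningMessage {
  projectId: string;
  text: string;
}

// Before: `const warnings = []` leaves the element type unknown
// After: the annotation tells TS exactly what the array holds
const warnings: WarningMessage[] = [];
warnings.push({ projectId: 'echo-1', text: 'Missing data for June' });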

TS Types for Libraries 🔗︎

In addition to dealing with your own code, you'll run into frequent cases where TS does not recognize an import for a JS library that you're using. Overall, there's several possible scenarios for JS libraries and TS types:

  • The library was written in TS, and thus includes the typedefs compiled from the original source. This Just Works, and the types will be correct because they came straight from the actual source.
  • The library was written in JS, but includes its own hand-written typedefs. This mostly Just Works in that TS can find the types automatically, but because the typedefs were written by hand they may not be perfect.
  • The library was written in JS and does not include its own types, but the community has written its own typedefs in the public DefinitelyTyped repository and those are published as @types/some-package. You'll have to install @types/some-package yourself and add it to devDependencies. Once installed TS should recognize the types. Again, because these are written separately from the original source, there's even less guarantee they're correct.
  • The library was written in JS and has no types available. At this point, you have three sub-options:
    • Add a // @ts-ignore over every import and let TS treat the imports as any
    • Add a separate typedefs file in your codebase with declare module "some-package", and let TS treat it as any
    • Add a separate typedefs file in your codebase, insert declare module "some-package" { /* write types here */ }, and actually start writing your own typedefs for that other library by hand

Fortunately the majority of the JS ecosystem is now either written in TS, includes types, or at least has decent typedefs available in DefinitelyTyped, so it's less common to have to resort to typing a third-party library yourself.

You can check the types status of a library in several ways: look for a "TS" or "DT" badge next to the package name on the npm website, check whether the package's package.json has a types/typings field, or search npm for a matching @types/some-package package.

Migrating CommonJS Modules to ES Modules 🔗︎

I've found that adding TS types isn't even the hardest part of the conversion process. Instead, what's really hard is dealing with CommonJS/ES Module file format interop, especially when you've got CJS files importing ESM and vice versa. As a brief recap:

  • CommonJS files use module.exports = someValue or module.exports.someField = someValue for exports, and const exported = require('./some-file') for imports. That means there's a couple of ways to define "named exports" - either by listing module.exports.someField explicitly, or by making the exported value an object like module.exports = {someField}. You may also see const { someField } = require('./some-file') to only grab specific fields on the importing side.
  • ES Module files use import/export syntax instead. It's important to understand the difference between "default" and "named" exports and how they normally work with ESM files by themselves.

I'm going to ignore the whole "Node.js doesn't yet have full support for ES Modules" and "publishing NPM packages with support for CJS and ESM formats" topics, because those are their own entirely separate problems. For this discussion, assume we're dealing with a codebase that runs and compiles with mixed JS/TS/CJS/ESM files, and you're working on converting a few files from JS+CJS to TS+ESM at a time.

If you have "esModuleInterop": true enabled in your tsconfig.json, a decent portion of cross-file import interop will get handled for you, but you still have to do some work to make sure things line up okay on both sides.

When importing a CJS file into an ESM file, you can sort of equate module.exports.someField = someValue or module.exports = {someField} to ESM named exports. If you try import { someField } from './some-cjs-file', it works most of the time in my experience. Default imports should also be fine - import defaultValue from './some-cjs-file' gives you the whole module.exports value.

Going the other way and importing ESM into CJS is trickier. I've seen cases where a named export like export const someField didn't seem to show up correctly on the CJS side. As a fallback, I've resorted to adding export default { someField } at the end of some TS files, so that the default export is an object containing the same fields that were named-exported earlier.
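As a rough sketch of what that interop dance looks like in practice (file and function names are hypothetical, and the exact behavior depends on your build setup):

// legacy-util.js (CommonJS):
//   module.exports.calculateStatus = calculateStatus;
//   // or: module.exports = { calculateStatus };

// new-feature.ts (ESM) - with esModuleInterop these usually line up:
//   import { calculateStatus } from './legacy-util';
//   import legacyUtil from './legacy-util'; // the whole exports object as a default

// converted-util.ts (ESM) that still has CJS consumers:
export function calculateStatus(value: number): string {
  return value > 0 ? 'green' : 'red';
}

// Fallback so the remaining CJS callers can still find the same fields
// on the default export:
export default { calculateStatus };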

OpenAPI Schema Generation 🔗︎

Our Express app's REST API was primarily intended for use by the web client, but we have a few scripts that pull data from the server, and a handful of other users that extract data as well. The previous team lead had partially documented some of the homegrown filtering and sorting query options, but none of the actual data types themselves were documented. One of the users requested that we try to document what the various API endpoints actually accepted and returned.

I didn't want to try to write any schema documentation by hand. There is a very neat tool called Stoplight Studio that helps you write OpenAPI schemas in a GUI, but in this case I needed something that would continue to stay up to date automatically as we made changes to the system.

I ended up writing a custom script that generated an OpenAPI schema as part of our staging/release build step. I used several libraries to do this. Since I was writing the initial version of the script before I'd done any TS conversion work, all I had available for type definitions was our Mongoose models. I found a library called mongoose-to-swagger that accepted Mongoose models and output JSON-Schema descriptions. I added some custom post-processing of the generated entries to fill in additional fields and descriptions.

For viewing the schemas, I added static copies of two different OpenAPI viewer tools: redoc and swagger-ui. Both offer standalone HTML+JS formats that can be copied and pasted directly into a repo and served as static HTML, with a little bit of JS configuration added to point them to fetching an OpenAPI file from the server. Redoc has a much nicer look and feel for the formatted data, while Swagger-UI offers interactive endpoint playgrounds in the page.

In order to get the data structures displayed in Redoc, I had to add "tags" to each entry with an 'x-displayName' field and a description: '<SchemaDefinition>' field, per discussion in the Redoc issues.

Later, after the TS conversion was complete, I reworked the script to pull most of its data from the TS types directly. I used ts-json-schema-generator to parse all of the core business logic files with their defined types and openapi3-ts to help build the actual OpenAPI schema. I also used express-list-endpoints to pull a list of all the endpoint definitions out of an instance of our Express app so I could add those to the schema, and did some more mashing together of the endpoints and the data types to fill out the request and response contents correctly.
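Here's a heavily condensed sketch of what that later version of the script does (the paths, type glob, and response handling are simplified stand-ins, and the app import is hypothetical; the real script does a lot more post-processing):

import { createGenerator } from 'ts-json-schema-generator';
import type { OpenAPIObject, PathsObject } from 'openapi3-ts';
import listEndpoints from 'express-list-endpoints';
// Hypothetical import of our actual Express app instance
import { app } from '../server/app';

// Generate JSON-Schema definitions straight from the core business logic types
const generator = createGenerator({
  path: 'src/core/types/*.ts',
  tsconfig: 'tsconfig.json',
  type: '*',
});
const { definitions } = generator.createSchema('*');

// Pull the list of real routes out of the Express app instance
const paths: PathsObject = {};
for (const endpoint of listEndpoints(app)) {
  paths[endpoint.path] = {};
  for (const method of endpoint.methods) {
    paths[endpoint.path][method.toLowerCase()] = {
      // The real script mashes the endpoints and data types together here
      responses: { '200': { description: 'TODO' } },
    };
  }
}

const openApiDoc: OpenAPIObject = {
  openapi: '3.0.0',
  info: { title: 'Project Echo API', version: '1.0.0' },
  paths,
  components: { schemas: definitions as any },
};

// The build step then writes openApiDoc out as JSON for redoc/swagger-ui to load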

Migrating the Existing App Features 🔗︎

By spring 2021 we had completed building out that "editing UI" feature. We'd fully proven out the Next client interop, built up a good foundation in the new codebase, and we were finally ready to start migrating actual existing app functionality over into the new client.

Adding Redux 🔗︎

We were still only using a single useReducer hook for the "editing UI" feature's state, albeit with an RTK createSlice-powered reducer. Around the same time that I started prototyping the migration of some of our dashboards into the Next app, the Redux team (ie, Lenz Weber, Matt Sutkowski, and myself) had published the initial alpha versions of our new RTK Query data fetching and caching addon for Redux Toolkit. Given that the app's dashboards did a bunch of data fetching, and that I wanted to try out RTKQ for myself and dogfood it, I decided it was finally time to add a real Redux store to the codebase. Also, the "editing UI" feature had grown complex enough that it really would benefit from visualizing its state in the Redux DevTools.

This was the first time I'd ever tried setting up a Redux store in a Next app. This is more complicated than a typical CRA setup. A CRA app only has a single "page", but a Next app has many pages plus a common pages/_app.tsx component. Also, it's very common to want to do server-side rendering in Next, where you fetch data on the server to do the initial page render and then need to rehydrate the Redux store on the client with that same data. This often requires adding a library like next-redux-wrapper to manage the rehydration process.

Since we were only using Next for pure client-side rendering at the time, I decided to bypass that part for now. I initially set up each separate "page" component with its own Redux store. This was made trickier because Next will render your page to static HTML on the server, so your store setup has to handle being run on both the client and the server. I ended up swiping a useStore hook pattern I saw in some of the Redux examples in the Next repo, and since we need to create unique store instances for each unit test, I also made an initStore() factory function. This got kinda fiddly:

import { configureStore, combineReducers } from '@reduxjs/toolkit'
import { useMemo } from 'react';

import counterReducer from '~/features/counter/counterSlice';
import { apiSlice } from '~/features/api/apiSlice';

export const rootReducer = combineReducers({
  api: apiSlice.reducer,
  counter: counterReducer,
});

export type RootState = ReturnType<typeof rootReducer>

export function initStore(preloadedState: Partial<RootState> = {}) {
  return configureStore({
    reducer: rootReducer,
    preloadedState,
    middleware: gDM => gDM().concat(apiSlice.middleware)
  })
}

export type AppStore = ReturnType<typeof initStore>

let existingStore: AppStore | undefined

export type AppDispatch = AppStore['dispatch']
export type AppThunk<ReturnValue = void, State = RootState> = (
  dispatch: AppDispatch,
  getState: () => State
) => ReturnValue;

export const initializeStore = (preloadedState?: any) => {
  let currentStore = existingStore ?? initStore(preloadedState);

  // After navigating to a page with an initial Redux state, merge that state
  // with the current state in the store, and create a new store
  if (preloadedState && existingStore) {
    currentStore = initStore({
      ...existingStore.getState(),
      ...preloadedState
    })
    existingStore = undefined
  }

  // For SSG and SSR always create a new store
  if (typeof window === 'undefined') return currentStore;
  // Create the store once in the client
  if (!existingStore) existingStore = currentStore;

  return currentStore;
}

export function useStore(initialState?: any) {
  return useMemo(() => initializeStore(initialState), [initialState])
}

This served us well as we migrated the first few pages over, but I had to copy-paste that file a couple of times as different pages needed different reducers, and the duplicate store definitions also complicated some of our test setup and test helper functions.

Later on, I decided that it would simplify things if we switched to having a single Redux store instance shared across all pages of the app, so I switched us over to a single copy of the store setup file loaded once.

Thus far we still haven't used any Redux-based SSR or rehydration, so we still aren't using next-redux-wrapper.

Building Features with Redux Toolkit and RTK Query 🔗︎

I'll skip the app-specific details again, but suffice it to say that actually using RTK Query in an app has been a joy. The legacy AngularJS app had a bunch of $resource subclasses to encapsulate API endpoints. Those worked, but were annoying to deal with. Related to that, the REST API has a ridiculously complicated query parameter setup that lets callers define nearly-arbitrary filtering, sorting, and GraphQL-ish selection of what fields to return. The client-side logic to manage most of that query params setup was one of the most complicated and bug-ridden portions of our codebase.

In contrast, RTKQ has made data fetching and caching trivial. The actual data fetching code is now just "define an endpoint, import the hook into a component, and call it with the arguments". We were also able to rework and drastically simplify the query params generation logic as part of the migration, and we feed that into the useDashboardQuery() hook as part of its cache arg.
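The endpoint and field names here are simplified stand-ins rather than our real definitions, but the shape of the RTK Query usage looks roughly like this:

// features/api/apiSlice.ts
import { createApi, fetchBaseQuery } from '@reduxjs/toolkit/query/react';

interface DashboardEntry {
  projectId: string;
  status: 'red' | 'yellow' | 'green';
}

interface DashboardQueryArgs {
  month: string;
}

export const apiSlice = createApi({
  reducerPath: 'api',
  baseQuery: fetchBaseQuery({ baseUrl: '/api' }),
  endpoints: builder => ({
    getDashboard: builder.query<DashboardEntry[], DashboardQueryArgs>({
      // The simplified query params generation feeds in here as part of the cache arg
      query: args => ({ url: '/dashboard', params: { month: args.month } }),
    }),
  }),
});

// Auto-generated hook, used directly in components:
// const { data = [], isFetching } = useGetDashboardQuery({ month: '2021-06' });
export const { useGetDashboardQuery } = apiSlice;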

Because all RTKQ data is loaded as Redux actions, we've been able to handle those in other slices as well and pull out some data for additional processing, like recursing through fetched items to extract comments and store them in a normalized lookup table with RTK's createEntityAdapter.
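For example (again with made-up names), a slice can match on the fulfilled query actions and normalize part of the payload with createEntityAdapter:

import { createSlice, createEntityAdapter } from '@reduxjs/toolkit';
import { apiSlice } from '../api/apiSlice';

interface Comment {
  id: string;
  projectId: string;
  text: string;
}

const commentsAdapter = createEntityAdapter<Comment>();

// Hypothetical helper: recurse through the fetched payload and pull out any nested comments
function extractComments(entries: any[]): Comment[] {
  return entries.flatMap(entry => entry.comments ?? []);
}

const commentsSlice = createSlice({
  name: 'comments',
  initialState: commentsAdapter.getInitialState(),
  reducers: {},
  extraReducers: builder => {
    // RTKQ fetches dispatch normal Redux actions, so other slices can respond to them
    builder.addMatcher(
      apiSlice.endpoints.getDashboard.matchFulfilled,
      (state, action) => {
        commentsAdapter.upsertMany(state, extractComments(action.payload));
      }
    );
  },
});

export default commentsSlice.reducer;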

Besides RTKQ, we've also used standard RTK slices as part of several pages, and that's gone beautifully as well. Complex slice reducers, multiple reducers handling actions, viewing state in the Redux DevTools... frankly it's fun to actually get to use RTK to build something, not just write tutorials about it :)

We did finally update the "editing UI" feature to pull its data from the actual Redux store instead of a useReducer hook. We had unintentionally gone with something of a "container/presentational"-ish pattern, where the root component made a bunch of const updateX = (arg) => dispatch(thingUpdated(arg)) handler functions and passed them down as props. When it came time to swap from useReducer to a Redux store, we didn't even have to change those lines :) We just replaced the useReducer() with useSelector() + useDispatch(), and it all Just Worked.
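In sketch form (the component, slice, and action names here are hypothetical, not our real ones), the swap looked something like this:

import { useSelector, useDispatch } from 'react-redux';
// Hypothetical slice extracted from the "editing UI" feature
import { thingUpdated, selectEditingState } from './editingSlice';

interface EditingDetailsProps {
  editingState: ReturnType<typeof selectEditingState>;
  onThingUpdated: (arg: string) => void;
}

// Presentational child - stubbed out here, unchanged by the swap
function EditingDetails({ onThingUpdated }: EditingDetailsProps) {
  return <button onClick={() => onThingUpdated('new value')}>Update</button>;
}

export function EditingRoot() {
  // Before: const [editingState, dispatch] = useReducer(editingReducer, initialState);
  const editingState = useSelector(selectEditingState);
  const dispatch = useDispatch();

  // These handler props stayed exactly the same before and after the swap
  const updateThing = (arg: string) => dispatch(thingUpdated(arg));

  return <EditingDetails editingState={editingState} onThingUpdated={updateThing} />;
}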

Test Setup Improvements 🔗︎

The JS build tooling ecosystem has been on fire lately, with tools like ESBuild, SWC, and Parcel using native-compiled languages like Rust and Go to help enable faster JS project build speeds.

I wanted to play around with some of those and see if we could get our own tests to run faster. I was able to use https://github.com/Brooooooklyn/swc-node and its @swc-node/jest package to replace use of Babel in our tests, and gained a moderate speedup in the process.
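In Jest config terms the change was pretty small - roughly just swapping the transform entry (sketched from memory; check the swc-node docs for the current options):

// jest.config.js (excerpt)
module.exports = {
  transform: {
    // Before: '^.+\\.(t|j)sx?$': 'babel-jest',
    '^.+\\.(t|j)sx?$': '@swc-node/jest',
  },
};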

For our new client unit tests, we're using Jest and @testing-library/react, with miragejs for mock APIs. (I tried using msw for mock APIs a couple times since we use those for RTK Query's own tests, but ran into a couple issues and punted on that.) I also updated our Cypress test setup to use @testing-library/cypress so we could leverage the same DOM query APIs in all our tests.

Given that our Cypress tests only run nightly, I wanted to do some "integration"-style tests of entire Next pages at once, so that we could run those as part of our standard unit test suite in PRs. I found a great library called https://github.com/toomuchdesign/next-page-tester, which mimics much of Next's page loading process to let you load an entire page at a time into a unit test based on a route. You can then test out the whole page like any other component test with RTL.
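A page-level test then looks roughly like this (the route and assertion are hypothetical, and the exact next-page-tester API may vary by version):

import { getPage } from 'next-page-tester';
import { screen } from '@testing-library/react';

test('renders the projects dashboard page', async () => {
  // Loads the matching file under pages/ (plus _app), much like Next itself would
  const { render } = await getPage({ route: '/projects' });
  render();

  expect(await screen.findByText('Project Dashboard')).toBeInTheDocument();
});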

Unfortunately, next-page-tester development currently appears to be stalled. It's always relied on Next internals, and the recent changes in Next 12 have apparently broken it. The author also isn't currently working on a Next-based project anyway. There's an open issue thread discussing if it's feasible to get Next to upstream some of the changes needed to keep next-page-tester working, and Lee Robinson from Vercel has responded, but as I write this it's unclear if it will be able to work in the future.

Completing the App Client Migration 🔗︎

Over Q2 and Q3 2021, we migrated our remaining pages over to the Next client. In the process, we also reviewed several existing features and pages that appeared to be no longer relevant, and got signoff from business stakeholders that we could just drop those entirely. I promptly updated the legacy client to stop showing the buttons or tabs to view those pages, and we simply did not migrate them to the new codebase.

By Q4, the only remaining work left to migrate was the "outer app shell": header bar, routes and page navigation, auth, and home page. I did a quick spike on a Friday afternoon to see how much I could get started, and to my very pleasant surprise was able to whip up several key pieces in just a few hours. I got the backend Express app to switch to only serving the new client based on a feature flag, added a new React-Bootstrap header bar, and even got basic signin/signout working (thanks to the already-implemented RTKQ endpoints for fetching the current user).

Over the last couple months my team pulled together the final pieces, and on Monday Dec 20 I hit the button to finally turn on the "new client only" flag and deploy the build to production!

Future Cleanup 🔗︎

Now that we've shipped the new client, we can finally take some time to catch our breath and clean things up.

I already filed one PR last year that nuked 45,000 lines of dead app and test code left over from the original REST API 1.0 implementation. Over the next few weeks, we'll do the same thing to the legacy client code, as well as cleaning up our dead feature flags.

From there, I've got plans to do some much-needed cleanup and refactoring of the backend. The remaining Express API endpoints need to be converted to TS; there's a bunch of patterns like mutating the req that I want to completely rethink and rearchitect, and there's other dead code in the backend that can go away. For that matter, only about half of our server tests have been converted from nested callbacks and should-style assertions over to async/await and expect(), and all of those are still plain JS. So, plenty of future cleanup work to keep us occupied :) (and somewhere in there I suppose we'll have to actually work on some business-related functionality too...)

Final Thoughts 🔗︎

As you can tell, migrating this codebase has been a very busy and complicated effort :) I'm really proud of several things:

  • That we were able to do the entire migration incrementally
  • That we continued delivering "business value" over time (I hate that phrase, but hey, the suits love it!)
  • That my team was able to learn and become proficient with React, Redux, and TS
  • That we successfully hit all our milestones throughout this process, both self-assigned and external (including a couple ridiculously short deadlines that were dropped on us without warning)

My team has been pretty small (4-5 people most of the way). Everyone's been involved throughout the whole process, and everyone's touched all areas of the codebase (legacy client, new client, backend, core logic, and tests). I really love working with this team, and it's been so satisfying for me to see them all learn, grow, and get things done. (I know there's a lot of "I did..." in this post, and I don't want to make it sound like I was the only one working on this. I did deliberately tackle a lot of the migration work myself, because it was more complicated and also more boring, so that I could enable the team to make progress on our other tasks.)

And, if I can brag on myself ever so slightly: that every single technical and architectural decision I made on this project has worked out as I expected and paid off down the road in helping us build a more maintainable codebase :)


This is a post in the Codebase Conversion series.