Run NPM Install on All Subdirectories Containing Packages
Say you have inherited an application deployment whose directory tree contains multiple Node.js packages scattered in various locations. As part of the setup you want to run npm install for each package individually, but the set of packages and their locations has changed over time and will keep changing. So you want a single script that walks the directory tree, identifies the Node.js packages, and runs the NPM installation wherever it is needed. That way the script needs no further updates even when packages move around.
The first tool to reach for in this sort of situation is a combination of find, to identify the package.json files, and xargs, to run the installation in each of the relevant directories. Simple enough, right? As always, the devil is in the details.
Some of the Details
Make Sure You Run NPM in Bash
First, note that NPM only works robustly in bash rather than sh. If you think you are using sh and NPM works for you, it is most likely because sh is symlinked to bash on your Unix variant. This can be an issue when running scripts in some provisioning systems, such as Puppet.
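One way to guard against being run by the wrong shell is to check for it up front. The following is a minimal sketch, not part of the scripts further down; the function name require_bash is made up for illustration. It relies on the fact that bash sets the BASH_VERSION variable, even when it has been invoked under the name sh.

```shell
# Hypothetical helper: succeed only when the current shell is bash.
# bash sets BASH_VERSION, including when it is started as "sh" via a symlink.
require_bash() {
    if [ -z "${BASH_VERSION:-}" ]; then
        echo "This script needs to be run with bash." >&2
        return 1
    fi
    return 0
}
```

A script would typically call it near the top, as in: require_bash || exit 1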
Maintain Distinct Cache Directories
Perhaps the most important issue is that running NPM installations concurrently will cause errors if more than one process uses the same cache directory. You must specify a unique cache directory for each process, and clean it up afterwards if you don't want your system to run out of disk space in due course.
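The idea can be sketched in a few lines. This is an illustration under assumptions rather than the script used later: the function name install_with_private_cache is hypothetical, and it uses mktemp -d to create the unique directory where the script below uses uuidgen. The installer command can be overridden so the function can be tried without NPM; by default it runs npm install with the --cache option.

```shell
# Hypothetical helper: run an installer in a package directory with its own
# throwaway cache directory, then delete that directory whatever the outcome.
# Usage: install_with_private_cache <package-dir> [installer command...]
install_with_private_cache() {
    package_dir="$1"
    shift
    # Default to "npm install" when no installer command is given.
    [ "$#" -gt 0 ] || set -- npm install

    # mktemp -d creates a directory with a unique name and prints its path.
    cache_dir="$(mktemp -d "${TMPDIR:-/tmp}/npm-cache.XXXXXX")" || return 1

    status=0
    ( cd "$package_dir" && "$@" --cache="$cache_dir" ) || status=$?

    # Clean up so that repeated runs do not fill the disk.
    rm -rf "$cache_dir"
    return "$status"
}
```

Calling install_with_private_cache /path/to/package would then run npm install there with a cache nobody else is using. The path here is only a placeholder.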
Error Code 255 and Xargs
When xargs runs a command for each item, it carries on through the remaining items regardless of each command's exit code, unless that code is 255, in which case xargs stops and all remaining items are skipped. However, the only way to make most commands exit with a code of 255 is to wrap them in a script. This means we need at least two scripts here: one to run xargs, and one to run npm install and return 255 on failure.
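The difference is easy to see with a toy example. The sketch below is purely illustrative (the function name run_items_failing_with is made up): it feeds three items to xargs one at a time and makes the command fail on the second item with whatever exit code you choose.

```shell
# Hypothetical demo: process the items a, b and c one at a time, failing on
# "b" with the exit code passed as the first argument.
run_items_failing_with() {
    fail_code="$1"
    printf 'a\nb\nc\n' |
        xargs -n 1 sh -c 'echo "processing $2"; [ "$2" != "b" ] || exit "$1"' _ "$fail_code" 2>/dev/null
}
```

With run_items_failing_with 1 all three items are processed even though the second one failed. With run_items_failing_with 255 xargs gives up after the second item and c is never touched.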
Use of -print0 and -0
Piping output from find to xargs can be sabotaged by filenames containing characters that xargs treats specially, such as spaces, quotes, backslashes and newlines. The way to armor against this is to run find with the -print0 option and xargs with the -0 option, so that paths are separated by null characters instead. You'll find this noted in the xargs man page.
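To see the problem in action, here is a small sketch with two made-up helper functions. Each counts how many separate arguments xargs ends up with for the package.json files under a directory, once with null separators and once without.

```shell
# Hypothetical demo: count the arguments xargs receives when paths are
# separated by null characters.
count_args_null_delimited() {
    find "$1" -name package.json -print0 |
        xargs -0 -n 1 printf '%s\n' | wc -l | tr -d ' '
}

# The same count when xargs splits its input on whitespace, the default.
count_args_whitespace_delimited() {
    find "$1" -name package.json |
        xargs -n 1 printf '%s\n' | wc -l | tr -d ' '
}
```

For a tree holding a single package in a directory called "my app", the first function reports one argument while the second reports two, because the path has been cut in half at the space.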
The Actual Scripts
But enough of the caveats and on with the show. The approach here splits things into two scripts, one to prepare the ground and run the find and xargs combination, and the other to run the actual process of installation with NPM. Both scripts should be in the same directory.
npm-install-for-subdirectories.sh
#!/bin/bash
#
# Run npm install for all Node.js packages under a given directory.
#
# Usage: $0 <directory>
#

set -o nounset
set -o errexit

# Check the input.
if [ "$#" -lt 1 ]; then
    echo "Usage: $0 <directory>"
    exit 1
fi

# The absolute path of the directory containing this script.
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

# The absolute path of the directory holding the Node.js packages of interest.
DIR="$( cd "$1" && pwd )"

# Number of processors. We want to run NPM installations concurrently since
# they can be time-consuming.
NPROC="$(nproc)"

# First clean out any existing installed modules. We don't want find picking up
# all of their package.json files.
#
# The use of -print0 in find and -0 for xargs is good practice as it prevents
# paths containing special characters such as spaces and quotes from
# messing things up. The -prune option stops find from descending into
# directories that are about to be deleted.
find "${DIR}" -type d -name "node_modules" -prune -print0 |
    xargs -0 --max-procs="${NPROC}" rm -Rf

# Then run npm install in every directory with a package.json file. This probably
# requires a little explanation, so an annotated version precedes the actual
# command.
#
#   # Obtain a set of absolute paths to package.json files. See the above note on
#   # use of -print0.
#   find "${DIR}" -name "package.json" -print0 |
#   # Now strip the /package.json part of each path to leave just the directory.
#   # The -z option (GNU sed) makes sed read null-separated records, matching
#   # -print0, so the pattern can be anchored to the end of each path.
#   sed -z 's,/package\.json$,,' |
#   # Next feed the list of absolute paths into xargs and run a command for
#   # each of them.
#   xargs
#   # See the above note on the use of -0.
#   -0
#   # Run these processes concurrently, with a limit equal to the number of
#   # cores on this machine.
#   --max-procs="${NPROC}"
#   # Replace % in the following command with the path passed to xargs.
#   -I %
#   # Run the installation script under bash, passing the path as a single
#   # argument so that any spaces in it survive.
#   bash "${SCRIPT_DIR}/npm-install-called-by-xargs.sh" %
#
find "${DIR}" -name "package.json" -print0 |
    sed -z 's,/package\.json$,,' |
    xargs -0 --max-procs="${NPROC}" -I % bash "${SCRIPT_DIR}/npm-install-called-by-xargs.sh" %
npm-install-called-by-xargs.sh
#!/bin/bash
#
# Run an NPM installation in a particular directory.
#
# Usage: $0 <directory>
#
# This is intended to be invoked by xargs, where instances of this script may
# run concurrently for each directory in a project that requires NPM installation.
# The goal is for xargs to halt immediately on failure of any one process,
# however, and this requires returning an exit code of 255 on failure.
#

# ----------------------------------------------------------------------------
# Error handling.
# ----------------------------------------------------------------------------

set -o errexit
set -o nounset

# Exit on error.
function handleError() {
    local LINE="$1"
    local MESSAGE="${2:-}"
    echo "Error on or near line ${LINE}${2:+: }${MESSAGE:-}."

    # To make xargs halt immediately, exit with a code of 255.
    exit 255
}

trap 'handleError ${LINENO}' ERR

# ----------------------------------------------------------------------------
# Manage arguments and variables.
# ----------------------------------------------------------------------------

if [ "$#" -ne "1" ]; then
    handleError "${LINENO}" "Usage: $0 <directory>"
fi

# The directory in which to run npm install.
PACKAGE_DIR="$1"

# NPM processes running concurrently will tend to error since they use the
# same cache folder. We have to ensure they use different cache directories
# by providing something unique to the --cache=/path option.
CACHE_UUID="$(uuidgen)"
CACHE_DIR="/tmp/${CACHE_UUID}"

# We need to ensure that cache directories are cleaned to keep disk
# utilization low, e.g. on a build server where this might run scores of times.
# Removing the directory on exit covers failed installations too, and works
# regardless of NPM version (since npm 5, npm cache clean refuses to run
# without --force).
trap 'rm -Rf "${CACHE_DIR}"' EXIT

# ----------------------------------------------------------------------------
# Run the NPM installation.
# ----------------------------------------------------------------------------

cd "${PACKAGE_DIR}"

# Delete any old modules first, just to be safe, even though that should have
# happened outside this script, prior to xargs being called. We are already
# inside the package directory, so a relative path is enough.
rm -Rf node_modules

# Finally we get to the installation.
npm install --cache="${CACHE_DIR}" --loglevel=info